WHAT’S SO SPECIAL ABOUT KRUSKAL’S THEOREM

AND THE ORDINAL $\Gamma_0$?

A  SURVEY  OF  SOME  RESULTS  IN  PROOF  THEORY 

Jean  H.  Gallier 


Abstract: This paper consists primarily of a survey of results of Harvey Friedman
about some proof theoretic aspects of various forms of Kruskal’s tree theorem,
and in particular the connection with the ordinal $\Gamma_0$. We also include a
fairly extensive treatment of normal functions on the countable ordinals, and
we give a glimpse of Veblen hierarchies, some subsystems of second-order logic,
slow-growing and fast-growing hierarchies including Girard’s result, and Goodstein
sequences. The central theme of this paper is a powerful theorem due to
Kruskal, the “tree theorem”, as well as a “finite miniaturization” of Kruskal’s
theorem due to Harvey Friedman. These versions of Kruskal’s theorem are remarkable
from a proof-theoretic point of view because they are not provable in
relatively strong logical systems. They are examples of so-called “natural independence
phenomena”, which are considered by most logicians as more natural
than the metamathematical incompleteness results first discovered by Gödel.
Kruskal’s tree theorem also plays a fundamental role in computer science, because
it is one of the main tools for showing that certain orderings on trees are
well founded. These orderings play a crucial role in proving the termination of
systems of rewrite rules and the correctness of Knuth-Bendix completion procedures.
There is also a close connection between a certain infinite countable
ordinal called $\Gamma_0$ and Kruskal’s theorem. Previous definitions of the function
involved in this connection are known to be incorrect, in that the function is not
monotonic. We offer a repaired definition of this function, and explore briefly
the consequences of its existence.
1 Introduction

This  paper  consists  primarily  of  a  survey  of  results  of  Harvey  Friedman  [47]  about  some 
proof  theoretic  aspects  of  various  forms  of  Kruskal’s  tree  theorem  [28],  and  in  particular  the 
connection with the ordinal $\Gamma_0$. Initially, our intention was to restrict ourselves to Kruskal’s
tree theorem and $\Gamma_0$. However, as we were trying to make this paper as self-contained as
possible,  we  found  that  it  was  necessary  to  include  a  fairly  extensive  treatment  of  normal 
functions  on  the  countable  ordinals.  Thus,  we  also  give  a  glimpse  of  Veblen  hierarchies, 
some  subsystems  of  second-order  logic,  slow-growing  and  fast-growing  hierarchies  including 
Girard’s  result,  and  Goodstein  sequences. 

The  central  theme  of  this  paper  is  a  powerful  theorem  due  to  Kruskal,  the  “tree 
theorem”,  as  well  as  a  “finite  miniaturization”  of  Kruskal’s  theorem  due  to  Harvey  Friedman. 
These versions of Kruskal’s theorem are remarkable from a proof-theoretic point of view
because they are not provable in relatively strong logical systems. They are examples of
so-called “natural independence phenomena”, which are considered by most logicians as
more natural than the metamathematical incompleteness results first discovered by Gödel.

Kruskal’s  tree  theorem  also  plays  a  fundamental  role  in  computer  science,  because  it 
is  one  of  the  main  tools  for  showing  that  certain  orderings  on  trees  are  well  founded.  These 
orderings  play  a  crucial  role  in  proving  the  termination  of  systems  of  rewrite  rules  and  the 
correctness  of  Knuth-Bendix  completion  procedures  [27]. 

There is also a close connection between a certain infinite countable ordinal called $\Gamma_0$
(Feferman [13], Schütte [46]) and Kruskal’s theorem. This connection lies in the fact that
there is a close relationship between the embedding relation $\preceq$ on the set $T$ of finite trees
(see definition 4.11) and the well-ordering $\leq$ on the set $\mathcal{O}(\Gamma_0)$ of all ordinals $< \Gamma_0$. Indeed,
it is possible to define a function $h : T \to \mathcal{O}(\Gamma_0)$ such that $h$ (1) is surjective, and (2)
preserves order, that is, if $s \preceq t$, then $h(s) \leq h(t)$. Previous definitions of this function are
known to be incorrect, in that the function is not monotonic. We offer a repaired definition
of this function, and explore briefly the consequences of its existence.

We believe that there is a definite value in bringing together a variety of topics revolving
around a common theme, in this case, ordinal notations and their use in mathematical
logic.  We  are  hoping  that  our  survey  will  help  in  making  some  beautiful  but  seemingly  rather 
arcane  tools  and  techniques  known  to  more  researchers  in  logic  and  theoretical  computer 
science. 

The paper is organized as follows. Section 2 contains all the definitions about preorders,
well-founded orderings, and well-quasi orders (WQO’s), needed in the rest of the
paper. Higman’s theorem for WQO’s on strings is presented in section 3. Several versions
of Kruskal’s tree theorem are presented in section 4. Section 5 is devoted to several versions
of the finite miniaturization of Kruskal’s theorem due to Harvey Friedman. Section 6 is a
fairly lengthy presentation of basic facts about the countable ordinals, normal functions,
and $\Gamma_0$. Most of this material is taken from Schütte [46], and we can only claim to have
presented it our own way, and hopefully made it more accessible. Section 7 gives a glimpse
at Veblen hierarchies. A constructive system of notations for $\Gamma_0$ is presented in section 8.
The connection between Kruskal’s tree theorem and $\Gamma_0$ due to Friedman is presented in
section 9. A brief discussion of some relevant subsystems of second-order arithmetic occurs
in section 10. An introduction to the theory of term orderings is presented in section 11,
including the recursive path ordering and the lexicographic path ordering. A glimpse at
slow-growing and fast-growing hierarchies is given in section 12. Finally, constructive proofs
of Higman’s lemma are briefly discussed in section 13.


2  Well  Quasi-Orders  (WQO’s) 

We let $\mathbf{N}$ denote the set $\{0, 1, 2, \ldots\}$ of natural numbers, and $\mathbf{N}_+$ denote the set $\{1, 2, \ldots\}$ of
positive natural numbers. Given any $n \in \mathbf{N}_+$, we let $[n]$ denote the finite set $\{1, 2, \ldots, n\}$,
and we let $[0] = \emptyset$. Given a set $S$, a finite sequence $u$ over $S$, or string over $S$, is a
function $u : [n] \to S$, for some $n \in \mathbf{N}$. The integer $n$ is called the length of $u$ and is
denoted by $|u|$. The special sequence with domain $\emptyset$ is called the empty sequence, or empty
string, and will be denoted by $\epsilon$. Strings can be concatenated in the usual way: Given
two strings $u : [m] \to S$ and $v : [n] \to S$, their concatenation, denoted by $u.v$ or $uv$, is
the string $uv : [m+n] \to S$ such that $uv(i) = u(i)$ if $1 \leq i \leq m$, and $uv(i) = v(i-m)$
if $m+1 \leq i \leq m+n$. Clearly, concatenation is associative and $\epsilon$ is an identity element.
Occasionally, a finite sequence $u$ of length $n$ will be denoted as $(u_1, \ldots, u_n)$ (denoting $u(i)$
as $u_i$), or as $u_1 \ldots u_n$. Strings of length 1 are identified with elements of $S$. The set of all
strings over $S$ is denoted as $S^*$.

An infinite sequence is a function $s : \mathbf{N}_+ \to S$. An infinite sequence $s$ is also denoted
by $(s_i)_{i \geq 1}$, or by $(s_1, s_2, \ldots, s_i, \ldots)$. Given an infinite sequence $s = (s_i)_{i \geq 1}$, an infinite
subsequence of $s$ is any infinite sequence $s' = (s'_j)_{j \geq 1}$ such that there is a strictly monotonic
function¹ $f : \mathbf{N}_+ \to \mathbf{N}_+$, and $s'_i = s_{f(i)}$ for all $i > 0$. An infinite subsequence $s'$ of $s$
associated with the function $f$ is also denoted as $s' = (s_{f(i)})_{i \geq 1}$.

¹ A function $f : \mathbf{N}_+ \to \mathbf{N}_+$ is strictly monotonic (or increasing) iff for all $i, j > 0$, $i < j$ implies that
$f(i) < f(j)$.

We now review preorders and well-foundedness.

Definition 2.1 Given a set $A$, a binary relation $\preceq\ \subseteq A \times A$ on the set $A$ is a preorder
(or quasi-order) iff it is reflexive and transitive. A preorder that is also antisymmetric is
called a partial order. A preorder is total iff for every $x, y \in A$, either $x \preceq y$ or $y \preceq x$. The
relation $\succeq$ is defined such that $x \succeq y$ iff $y \preceq x$, the relation $\prec$ such that

$x \prec y$ iff $x \preceq y$ and $y \not\preceq x$,

the relation $\succ$ such that $x \succ y$ iff $y \prec x$, and the equivalence relation $\approx$ such that

$x \approx y$ iff $x \preceq y$ and $y \preceq x$.

We say that $x$ and $y$ are incomparable iff $x \not\preceq y$ and $y \not\preceq x$, and this is also denoted by $x \mid y$.

Given two preorders $\preceq_1$ and $\preceq_2$ on a set $A$, $\preceq_2$ is an extension of $\preceq_1$ iff $\preceq_1\ \subseteq\ \preceq_2$. In
this case, we also say that $\preceq_1$ is a restriction of $\preceq_2$.

Definition 2.2 Given a preorder $\preceq$ over a set $A$, an infinite sequence $(x_i)_{i \geq 1}$ is an infinite
decreasing chain iff $x_i \succ x_{i+1}$ for all $i \geq 1$. An infinite sequence $(x_i)_{i \geq 1}$ is an infinite
antichain iff $x_i \mid x_j$ for all $i, j$, $1 \leq i < j$. We say that $\preceq$ is well-founded and that $\succ$ is
Noetherian iff there are no infinite decreasing chains w.r.t. $\succ$.

We now turn to the fundamental concept of a well quasi-order. This concept goes
back at least to Janet [23], whose paper appeared in 1920, as recently noted by Pierre
Lescanne [31]. Irving Kaplansky also told me that this concept is defined and used in his
Ph.D. thesis [25] (1941). The concept was further investigated by Higman [22], Kruskal [28],
and Nash-Williams [36], among the forerunners.

Definition 2.3 Given a preorder $\preceq$ over a set $A$, an infinite sequence $(a_i)_{i \geq 1}$ of elements
in $A$ is termed good iff there exist positive integers $i, j$ such that $i < j$ and $a_i \preceq a_j$, and
otherwise, it is termed a bad sequence. A preorder $\preceq$ is a well quasi-order, abbreviated as
wqo, iff every infinite sequence of elements of $A$ is good.
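
Goodness of an infinite sequence cannot be decided by a program, but a finite prefix can at least be searched for a witness pair. The following is a minimal sketch under that assumption; the names `find_good_pair` and `divides` are illustrative and do not come from the paper.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def find_good_pair(
    prefix: Sequence[T], leq: Callable[[T, T], bool]
) -> Optional[Tuple[int, int]]:
    """Return 1-based indices (i, j) with i < j and leq(a_i, a_j), or None.

    Only a finite prefix is inspected, so None means "no witness yet",
    not that the underlying infinite sequence is bad.
    """
    for j in range(1, len(prefix)):
        for i in range(j):
            if leq(prefix[i], prefix[j]):
                return (i + 1, j + 1)
    return None


def divides(a: int, b: int) -> bool:
    """Divisibility on positive integers: a preorder that is not a wqo."""
    return b % a == 0


print(find_good_pair([6, 10, 15, 20, 45], divides))  # (2, 4), since 10 divides 20
print(find_good_pair([7, 5, 3, 2], divides))         # None: distinct primes
```

Divisibility is used here because the sequence of all primes is an infinite antichain for it, so every prefix of that sequence yields `None`.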

Among the various characterizations of wqo’s, the following ones are particularly useful.

Lemma 2.4 Given a preorder $\preceq$ on a set $A$, the following conditions are equivalent:

1. Every infinite sequence is good (w.r.t. $\preceq$).

2. There are no infinite decreasing chains and no infinite antichains (w.r.t. $\preceq$).

3. Every preorder extending $\preceq$ (including $\preceq$ itself) is well-founded.
Proof. (1) $\Rightarrow$ (2). Suppose that $(x_i)_{i \geq 1}$ is an infinite sequence over $A$ such that $x_i \succ x_{i+1}$
for all $i \geq 1$. Hence, for every $i \geq 1$,

$x_{i+1} \prec x_i$.    (*)

Since $\preceq$ satisfies (1), there exist some integers $i, j > 0$ such that $i < j$ and $x_i \preceq x_j$. If
$j = i + 1$, this contradicts (*). If $j > i + 1$, by transitivity of $\preceq$, since $x_{j-1} \preceq x_i \preceq x_j$,
we have $x_{j-1} \preceq x_j$, contradicting (*). Hence there are no infinite decreasing sequences,
that is, $\preceq$ is well-founded. Also, it is clear that the existence of an infinite antichain would
contradict (1).

(2) $\Rightarrow$ (3). This proof is identical to the first part of the proof of (1) $\Rightarrow$ (2).

(3) $\Rightarrow$ (1). If (1) fails, then there is some infinite sequence $s = (x_i)_{i \geq 1}$ such that
$x_i \not\preceq x_j$ for all $i, j$, $1 \leq i < j$. But then, we can extend $\preceq$ to a preorder $\preceq'$ such that $s$
becomes an infinite decreasing chain in $\preceq'$, contradicting (3). □

It is interesting to observe that the property of being a wqo is substantially stronger
than being well-founded. Indeed, it is not true in general that any preorder extending a
given well-founded preorder is well-founded. However, by (3) of lemma 2.4, this property
characterizes a wqo. Every preorder on a finite set (including the equality relation) is a
wqo, and by (3) of lemma 2.4, every partial ordering that is total and well-founded is a wqo
(such orderings are called well-orderings).

The  following  lemma  turns  out  to  be  the  key  to  the  proof  of  Kruskal’s  theorem.  It  is 
implicit  in  Nash-Williams  [36],  lemma  1,  page  833. 

Lemma 2.5 Given a preorder $\preceq$ on a set $A$, the following are equivalent:

(1) $\preceq$ is a wqo on $A$.

(2) Every infinite sequence $s = (s_i)_{i \geq 1}$ over $A$ contains some infinite subsequence
$s' = (s_{f(i)})_{i \geq 1}$ such that $s_{f(i)} \preceq s_{f(i+1)}$ for all $i > 0$.

Proof. It is clear that (2) implies (1). Next, assume that $\preceq$ is a wqo. We say that a member
$s_i$ of a sequence $s$ is terminal iff there is no $j > i$ such that $s_i \preceq s_j$. We claim that the
number of terminal elements in the sequence $s$ is finite. Otherwise, the infinite sequence $t$ of
terminal elements in $s$ is a bad sequence (because if the sequence $t$ was good, then we would
have $s_h \preceq s_k$ for two terminal elements in $s$ with $h < k$, contradicting the fact that $s_h$ is terminal),
and this contradicts the fact that $\preceq$ is a wqo. Hence, there is some $N > 0$ such that $s_i$ is
not terminal for every $i \geq N$. We can define a strictly monotonic function $f$ inductively
as follows. Let $f(1) = N$, and for any $i \geq 1$, let $f(i+1)$ be the least integer such that
$s_{f(i)} \preceq s_{f(i+1)}$ and $f(i+1) > f(i)$ (since every element $s_{f(i)}$ is not terminal by the choice of
$N$ and the definition of $f$, such an element exists). The infinite subsequence $s' = (s_{f(i)})_{i \geq 1}$
has the property stated in (2). □

As a corollary of lemma 2.5, we obtain another result of Nash-Williams [36]. Given
two preorders $\langle \preceq_1, A_1 \rangle$ and $\langle \preceq_2, A_2 \rangle$, the cartesian product $A_1 \times A_2$ is equipped with the
preorder $\preceq$ defined such that $(a_1, a_2) \preceq (a'_1, a'_2)$ iff $a_1 \preceq_1 a'_1$ and $a_2 \preceq_2 a'_2$.

Lemma 2.6 If $\preceq_1$ and $\preceq_2$ are wqo, then $\preceq$ is a wqo on $A_1 \times A_2$.

Proof. Consider any infinite sequence $s$ in $A_1 \times A_2$. This sequence is formed of pairs
$(s'_i, s''_i) \in A_1 \times A_2$, and defines an infinite sequence $s' = (s'_i)_{i \geq 1}$ over $A_1$ and an infinite
sequence $s'' = (s''_i)_{i \geq 1}$ over $A_2$. By lemma 2.5, since $\preceq_1$ is a wqo, there is some infinite
subsequence $t' = (s'_{f(i)})_{i \geq 1}$ of $s'$ such that $s'_{f(i)} \preceq_1 s'_{f(i+1)}$ for all $i > 0$. Since $\preceq_2$ is also
a wqo and $t'' = (s''_{f(i)})_{i \geq 1}$ is an infinite sequence over $A_2$, there exist some $i, j$ such that
$f(i) < f(j)$ and $s''_{f(i)} \preceq_2 s''_{f(j)}$. Then, we have $(s'_{f(i)}, s''_{f(i)}) \preceq (s'_{f(j)}, s''_{f(j)})$, which shows that
the sequence $s$ is good, and that $\preceq$ is a wqo. □

In  turn,  lemma  2.6  yields  an  interesting  result  due  to  Dickson  [12],  published  in  1913! 

Lemma 2.7 Let $n$ be any integer such that $n > 1$. Given any infinite sequence $(s_i)_{i \geq 1}$ of
$n$-tuples of natural numbers, there exist positive integers $i, j$ such that $i < j$ and $s_i \preceq_n s_j$,
where $\preceq_n$ is the partial order on $n$-tuples of natural numbers induced by the natural ordering
$\leq$ on $\mathbf{N}$.

Proof. The proof follows immediately by observing that $\leq$ is a wqo on $\mathbf{N}$ and that lemma
2.6 extends to any $n > 1$ by a trivial induction. □
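
As a small illustration of lemma 2.7, the componentwise order on tuples can be written down directly and a finite prefix searched for a pair $i < j$ with $s_i \preceq_n s_j$. This is only a sketch on finite data; the function names `tuple_leq` and `first_good_pair` are assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple


def tuple_leq(a: Tuple[int, ...], b: Tuple[int, ...]) -> bool:
    """Componentwise order on n-tuples induced by <= on the naturals."""
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))


def first_good_pair(
    prefix: Sequence[Tuple[int, ...]]
) -> Optional[Tuple[int, int]]:
    """Return 1-based (i, j), i < j, with prefix[i-1] <=_n prefix[j-1], if any."""
    for j in range(1, len(prefix)):
        for i in range(j):
            if tuple_leq(prefix[i], prefix[j]):
                return (i + 1, j + 1)
    return None


# The first four pairs form an antichain; the fifth dominates the second.
pairs = [(3, 0), (2, 1), (1, 2), (0, 3), (2, 2)]
print(first_good_pair(pairs))  # (2, 5)
```

The lemma guarantees that such a pair eventually appears in every infinite sequence, but it gives no bound on how long a bad prefix can be.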

Next, given a wqo $\preceq$ on a set $A$, we shall extend $\preceq$ to the set of strings $A^*$, and prove
what  is  known  as  Higman’s  theorem  [22]. 

3  WQO’s  On  Strings,  Higman’s  Theorem 

Our presentation of Higman’s theorem is inspired by Nash-Williams’s proof of a similar
theorem ([36], lemma 2, page 834), and is also very similar to the proof given by Steve
Simpson ([47], lemma 1.6, page 92). Nash-Williams’s proof is not entirely transparent, and
Simpson’s proof appeals to Ramsey’s theorem. Using lemma 2.5, it is possible to simplify
the proof. A proof along this line has also been given by Jean Jacques Levy in some
unpublished notes [33] that came mysteriously into my possession.


Draft/September  30,  1993 




Definition 3.1 Let $\sqsubseteq$ be a preorder on a set $A$. We define the preorder $\ll$ (string embedding)
on $A^*$ as follows: $\epsilon \ll u$ for each $u \in A^*$, and, for any two strings $u = u_1 u_2 \ldots u_m$
and $v = v_1 v_2 \ldots v_n$, $1 \leq m \leq n$,

$u_1 u_2 \ldots u_m \ll v_1 v_2 \ldots v_n$

iff there exist integers $j_1, \ldots, j_m$ such that $1 \leq j_1 < j_2 < \cdots < j_{m-1} < j_m \leq n$ and

$u_1 \sqsubseteq v_{j_1}, \ldots, u_m \sqsubseteq v_{j_m}$.

It is easy to show that $\ll$ is a preorder, and we leave as an exercise to show that $\ll$ is
a partial order if $\sqsubseteq$ is a partial order. It is also easy to check that $\ll$ is the least preorder
on $A^*$ satisfying the following two properties:

(1) (deletion property) $uv \ll uav$, for all $u, v \in A^*$ and $a \in A$;

(2) (monotonicity) $uav \ll ubv$ whenever $a \sqsubseteq b$, for all $u, v \in A^*$ and $a, b \in A$.
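
String embedding on finite strings can be tested with a left-to-right greedy scan: each symbol of $u$ is matched to the earliest unused position of $v$ that dominates it. The sketch below assumes strings are given as Python sequences and the base preorder as a function `leq`; the name `string_embeds` is illustrative.

```python
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")


def string_embeds(
    u: Sequence[A], v: Sequence[A], leq: Callable[[A, A], bool]
) -> bool:
    """Decide whether u embeds in v: is there 1 <= j_1 < ... < j_m <= n
    with leq(u[k], v[j_k]) for every k?

    Matching each u[k] to the leftmost admissible position never hurts,
    because it leaves the longest possible suffix of v for later symbols.
    """
    j = 0
    for a in u:
        # Skip positions of v that do not dominate the current symbol.
        while j < len(v) and not leq(a, v[j]):
            j += 1
        if j == len(v):
            return False
        j += 1  # position j is consumed; indices stay strictly increasing
    return True


eq = lambda a, b: a == b
print(string_embeds("ace", "abcde", eq))   # True: plain subsequence
print(string_embeds("aec", "abcde", eq))   # False: order matters
print(string_embeds([1, 3], [0, 2, 1, 5], lambda a, b: a <= b))  # True
```

With equality as the base preorder this is the usual subsequence test; the empty string embeds in every string, as required by the definition.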

Theorem 3.2 (Higman) If $\sqsubseteq$ is a wqo on $A$, then $\ll$ is a wqo on $A^*$.

Proof. Assume that $\ll$ is not a wqo on $A^*$. Then, there is at least one bad sequence from
$A^*$. Following Nash-Williams, we define a minimal bad sequence $t$ inductively as follows.
Let $t_1$ be a string of minimal length starting a bad sequence. If $t_1, \ldots, t_n$ have been defined,
let $t_{n+1}$ be a string of minimal length such that there is a bad sequence whose first $n + 1$
elements are $t_1, \ldots, t_n, t_{n+1}$. Note that we must have $|t_i| \geq 1$ for all $i \geq 1$, since otherwise the
sequence $t$ is not bad (since $\epsilon \ll u$ for each $u \in A^*$). Since $|t_i| \geq 1$ for all $i \geq 1$, let

$t_i = a_i s_i$,

where $a_i \in A$ is the leftmost symbol in $t_i$. The elements $a_i$ define an infinite sequence
$a = (a_i)_{i \geq 1}$ in $A$, and the $s_i$ define an infinite sequence $s = (s_i)_{i \geq 1}$ in $A^*$. Since $\sqsubseteq$ is a
wqo on $A$, by lemma 2.5, there is an infinite subsequence $a' = (a_{f(i)})_{i \geq 1}$ of $a$ such that
$a_{f(i)} \sqsubseteq a_{f(i+1)}$ for all $i > 0$. We claim that the infinite subsequence $s' = (s_{f(i)})_{i \geq 1}$ of $s$ is
good. Otherwise, if $s' = (s_{f(i)})_{i \geq 1}$ is bad, there are two cases.

Case 1: $f(1) = 1$. Then, the infinite sequence $s' = (s_{f(i)})_{i \geq 1}$ is a bad sequence with
$|s_1| < |t_1|$, contradicting the minimality of $t$.

Case 2: $f(1) > 1$. Then, the infinite sequence

$(t_1, t_2, \ldots, t_{f(1)-1}, s_{f(1)}, s_{f(2)}, \ldots, s_{f(j)}, \ldots)$

is also bad, because $t_k = a_k s_k$ for all $k \geq 1$ and $t_i \ll s_{f(j)}$ implies that $t_i \ll t_{f(j)}$ by the
definition of $\ll$. But $|s_{f(1)}| < |t_{f(1)}|$, and this contradicts the minimality of $t$.

Since the sequence $s' = (s_{f(i)})_{i \geq 1}$ is good, there are some positive integers $i, j$ such
that $f(i) < f(j)$ and $s_{f(i)} \ll s_{f(j)}$. Since the infinite sequence $a' = (a_{f(i)})_{i \geq 1}$ was chosen
such that $a_{f(i)} \sqsubseteq a_{f(i+1)}$ for all $i > 0$, by the definition of $\ll$, we have

$a_{f(i)} s_{f(i)} \ll a_{f(j)} s_{f(j)}$,

that is, $t_{f(i)} \ll t_{f(j)}$ (since $t_k = a_k s_k$ for all $k \geq 1$). But this shows that the sequence $t$ is
good, contradicting the initial assumption that $t$ is bad. □

A theorem similar to theorem 3.2 applying to finite subsets of $A$ can be shown. Following
Nash-Williams [36], let $\mathcal{F}(S)$ denote the set of all finite subsets of $S$. Given any
two subsets $A, B$ of $S$, a function $f : A \to B$ is non-descending if $a \sqsubseteq f(a)$ for every
$a \in A$. The set $\mathcal{F}(S)$ is equipped with the preorder $\ll$ defined as follows: $\emptyset \ll A$ for every
$A \in \mathcal{F}(S)$, and for any two nonempty subsets $A, B \in \mathcal{F}(S)$, $A \ll B$ iff there is an injective
non-descending function $f : A \to B$. The proof of theorem 3.2 can be trivially modified to
obtain the following.

Theorem 3.3 (Nash-Williams) If $\sqsubseteq$ is a wqo on $A$, then $\ll$ is a wqo on $\mathcal{F}(A)$.
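
For finite sets, the relation $A \ll B$ amounts to finding an injective assignment of elements of $A$ to elements of $B$ that dominate them. A minimal backtracking sketch follows; it is a naive search chosen for brevity (a bipartite matching algorithm would scale better), and the name `set_embeds` is an assumption of this example.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")


def set_embeds(
    xs: Iterable[A], ys: Iterable[A], leq: Callable[[A, A], bool]
) -> bool:
    """Decide whether there is an injective non-descending map from xs to ys,
    i.e. an injective f with leq(a, f(a)) for every a in xs."""
    a: List[A] = list(xs)
    b: List[A] = list(ys)
    used = [False] * len(b)

    def place(k: int) -> bool:
        # Try to assign a[k], a[k+1], ... to distinct unused elements of b.
        if k == len(a):
            return True
        for idx, y in enumerate(b):
            if not used[idx] and leq(a[k], y):
                used[idx] = True
                if place(k + 1):
                    return True
                used[idx] = False  # undo and try another image for a[k]
        return False

    return place(0)


le = lambda x, y: x <= y
print(set_embeds({1, 3}, {2, 3}, le))  # True: 1 -> 2, 3 -> 3
print(set_embeds({2, 3}, {1, 5}, le))  # False: only 5 can serve as an image
```

The empty set embeds in every finite set, since the search succeeds immediately when there is nothing to place.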

We  now  turn  to  trees. 

4  WQO’s  On  Trees,  Kruskal’s  Tree  Theorem 

First,  we  review  the  definition  of  trees  in  terms  of  tree  domains. 

Definition 4.1 A tree domain $D$ is a nonempty subset of strings in $\mathbf{N}_+^*$ satisfying the
conditions:

(1) For all $u, v \in \mathbf{N}_+^*$, if $uv \in D$ then $u \in D$.

(2) For all $u \in \mathbf{N}_+^*$, for every $i \in \mathbf{N}_+$, if $ui \in D$ then, for every $j$, $1 \leq j \leq i$, $uj \in D$.

The elements of $D$ are called tree addresses or nodes. We now consider labeled trees.

Definition 4.2 Given any set $\Sigma$ of labels, a $\Sigma$-tree (or term) is any function $t : D \to \Sigma$,
where $D$ is a tree domain denoted by $dom(t)$.

Hence, a labeled tree is defined by a tree domain $D$ and a labeling function $t$ with
domain $D$ and range $\Sigma$. The tree address $\epsilon$ is called the root of $t$, and its label $t(\epsilon)$ is
denoted as $root(t)$. A tree is finite iff its domain is finite. In the rest of this paper, only
finite trees will be considered. The set of all finite $\Sigma$-trees is denoted as $T_\Sigma$.


Definition 4.3 Given a (finite) tree $t$, the number of tree addresses in $dom(t)$ is denoted
by $|t|$. The depth of a tree $t$ is defined as $depth(t) = \max(\{|u| \mid u \in dom(t)\})$. The number
of immediate successors of the root of a tree is denoted by $rank(t)$, and it is defined formally
as the number of elements in the set $\{i \mid i \in \mathbf{N}_+ \text{ and } i \in dom(t)\}$. Given a tree $t$ and some
tree address $u \in dom(t)$, the subtree of $t$ rooted at $u$ is the tree $t/u$ whose domain is the set
$\{v \mid uv \in dom(t)\}$ and such that $t/u(v) = t(uv)$ for all $v$ in $dom(t/u)$.
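
The tree-domain view of definitions 4.1 to 4.3 translates almost literally into code if a tree is stored as a dictionary from addresses (tuples of positive integers) to labels. The sketch below makes that representation choice as an assumption; the helper names are illustrative.

```python
from typing import Dict, Hashable, Iterable, Tuple

Address = Tuple[int, ...]
Tree = Dict[Address, Hashable]  # maps each address in dom(t) to its label


def is_tree_domain(addresses: Iterable[Address]) -> bool:
    """Check conditions (1) and (2) of a tree domain on a finite set."""
    d = set(addresses)
    if () not in d:  # a nonempty prefix-closed set must contain the root
        return False
    for u in d:
        if not u:
            continue
        if any(k < 1 for k in u):
            return False
        # (1) prefix closure: checking the parent of every node suffices.
        if u[:-1] not in d:
            return False
        # (2) left-sibling closure: checking the immediate left sibling suffices.
        if u[-1] > 1 and u[:-1] + (u[-1] - 1,) not in d:
            return False
    return True


def size(t: Tree) -> int:
    return len(t)


def depth(t: Tree) -> int:
    return max(len(u) for u in t)


def rank(t: Tree) -> int:
    """Number of immediate successors of the root."""
    return sum(1 for u in t if len(u) == 1)


def subtree(t: Tree, u: Address) -> Tree:
    """The tree t/u: addresses v with uv in dom(t), labeled by t(uv)."""
    return {a[len(u):]: lab for a, lab in t.items() if a[: len(u)] == u}


# The term f(a, g(b)) as a labeled tree domain.
t = {(): "f", (1,): "a", (2,): "g", (2, 1): "b"}
print(is_tree_domain(t), size(t), depth(t), rank(t))  # True 4 2 2
print(subtree(t, (2,)))  # {(): 'g', (1,): 'b'}
```

The set `{(), (2,)}` is rejected by `is_tree_domain` because address 2 is present without its left sibling 1, which violates condition (2).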

A tree $t$ such that $rank(t) = 0$ is a one-node tree, and if $root(t) = f$, $t$ will also
be denoted by $f$. Given any $k \geq 1$ trees $t_1, \ldots, t_k$ and any element $f \in \Sigma$, the tree
$t = f(t_1, \ldots, t_k)$ is the tree whose domain is the set

$\{\epsilon\} \cup \bigcup_{i=1}^{k} \{iu \mid u \in dom(t_i)\}$,

and whose labeling function is defined such that $t(\epsilon) = f$ and $t(iu) = t_i(u)$, for $u \in dom(t_i)$,
$1 \leq i \leq k$. It is well known that every finite tree $t$ is either a one-node tree, or can be
written uniquely as $t = f(t/1, \ldots, t/k)$, where $f = root(t)$, and $k = rank(t)$. It is also
convenient to introduce the following abbreviations. Let $\sqsubseteq$ be a binary relation on trees.
Then

$s \sqsubseteq f(\ldots, s, \ldots)$

is an abbreviation for $s \sqsubseteq f(s_1, \ldots, s_{i-1}, s, s_{i+1}, \ldots, s_n)$,

$f(\ldots) \sqsubseteq f(\ldots, s, \ldots)$

is an abbreviation for $f(s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n) \sqsubseteq f(s_1, \ldots, s_{i-1}, s, s_{i+1}, \ldots, s_n)$, and

$f(\ldots, s, \ldots) \sqsubseteq g(\ldots, t, \ldots)$

is an abbreviation for $f(s_1, \ldots, s_{i-1}, s, s_{i+1}, \ldots, s_n) \sqsubseteq g(s_1, \ldots, s_{i-1}, t, s_{i+1}, \ldots, s_n)$, for
some trees $s, t, s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n$, $1 \leq i \leq n$. When $n = 1$, these are understood as
$s \sqsubseteq f(s)$, $f \sqsubseteq f(s)$, and $f(s) \sqsubseteq g(t)$.

4.1  Kruskal’s  Theorem,  Version  1 

Assuming that $\Sigma$ is preordered by $\sqsubseteq$, we define a preorder $\preceq$ on $\Sigma$-trees extending $\sqsubseteq$ in the
following way.

Definition 4.4 Assume that $\sqsubseteq$ is a preorder on $\Sigma$. The preorder $\preceq$ on $T_\Sigma$ (homeomorphic
embedding) is defined inductively as follows: Either

(1) $f \preceq g(t_1, \ldots, t_n)$ iff $f \sqsubseteq g$; or

(2) $s \preceq g(\ldots, t, \ldots)$ iff $s \preceq t$; or

(3) $f(s_1, \ldots, s_m) \preceq g(t_1, \ldots, t_n)$ iff $f \sqsubseteq g$, and there exist some integers $j_1, \ldots, j_m$ such
that $1 \leq j_1 < j_2 < \cdots < j_{m-1} < j_m \leq n$, $1 \leq m \leq n$, and

$s_1 \preceq t_{j_1}, \ldots, s_m \preceq t_{j_m}$.

Note that (1) can be viewed as the special case of (3) for which $m = 0$, and $n = 0$
is possible. It is easy to show that $\preceq$ is a preorder. One can also show that $\preceq$ is a
partial order if $\sqsubseteq$ is a partial order. This can be shown by observing that $s \preceq t$ implies
that $depth(s) \leq depth(t)$. Hence, if $s \preceq t$ and $t \preceq s$, we have $depth(s) = depth(t)$ and
$rank(s) = rank(t)$ (since only case (1) or (3) can apply). Then, we can show that $s = t$ by
induction on the depth of trees.

It is also easy to show that the preorder $\preceq$ can be defined as the least preorder
satisfying the following properties:

(1) $s \preceq f(\ldots, s, \ldots)$;

(2) $f(\ldots) \preceq f(\ldots, s, \ldots)$;

(3) $f(\ldots, s, \ldots) \preceq g(\ldots, t, \ldots)$ whenever $f \sqsubseteq g$ and $s \preceq t$.
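
The three clauses of definition 4.4 can be turned into a recursive test on finite trees. The sketch below assumes trees are represented as nested pairs `(label, children)` and that the preorder on labels is supplied as a function `leq`; the names `node` and `embeds` are illustrative, not taken from the paper.

```python
from typing import Callable, Hashable, Tuple

Term = Tuple[Hashable, tuple]  # (root label, tuple of immediate subtrees)


def node(label: Hashable, *children: Term) -> Term:
    """Build the tree label(children...); with no children, a one-node tree."""
    return (label, tuple(children))


def embeds(s: Term, t: Term, leq: Callable[[Hashable, Hashable], bool]) -> bool:
    """Homeomorphic embedding test following clauses (1)-(3) of definition 4.4."""
    f, ss = s
    g, ts = t
    # Clause (2): s embeds in some immediate subtree of t.
    if any(embeds(s, tj, leq) for tj in ts):
        return True
    # Clauses (1) and (3): the roots must be related ...
    if not leq(f, g):
        return False
    # ... and the subtrees of s must embed, in order, into distinct subtrees
    # of t. A leftmost greedy scan suffices, as for string embedding.
    j = 0
    for si in ss:
        while j < len(ts) and not embeds(si, ts[j], leq):
            j += 1
        if j == len(ts):
            return False
        j += 1
    return True


eq = lambda a, b: a == b
t = node("f", node("g", node("a")), node("c"), node("h", node("b")))
print(embeds(node("f", node("a"), node("b")), t, eq))  # True
print(embeds(node("f", node("b"), node("a")), t, eq))  # False: order of subtrees
print(embeds(node("g", node("a")), t, eq))             # True, via clause (2)
```

When `s` is a one-node tree the loop over its subtrees is empty, so the test reduces to `leq(f, g)`, which is clause (1) viewed as the case $m = 0$ of clause (3).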

We  now  prove  a  version  of  Kruskal’s  theorem  [28]. 

Theorem 4.5 (Kruskal’s tree theorem) If $\sqsubseteq$ is a wqo on $\Sigma$, then $\preceq$ is a wqo on $T_\Sigma$.

Proof. Assume that $\preceq$ is not a wqo on $T_\Sigma$. As in the proof of theorem 3.2, we define a
minimal bad sequence $t$ of elements of $T_\Sigma$ satisfying the following properties:

(i) $|t_1| \leq |t'_1|$ for all bad sequences $t'$;

(ii) $|t_{n+1}| \leq |t'_{n+1}|$ for all bad sequences $t'$ such that $t'_i = t_i$, $1 \leq i \leq n$.

We claim that $|t_i| \geq 2$ for all but finitely many $i \geq 1$. Otherwise, the sequence of
one-node trees in $t$ must be infinite, and since $\sqsubseteq$ is a wqo, by clause (1) of the definition of
$\preceq$, there are $i, j > 0$ such that $i < j$ and $t_i \preceq t_j$, contradicting the fact that $t$ is bad.

Let $s = (s_i)_{i \geq 1}$ be the infinite subsequence of $t$ consisting of all trees having at least
two nodes, and let $f = (f_i)_{i \geq 1}$ be the infinite sequence over $\Sigma$ defined such that $f_i = root(s_i)$
for every $i \geq 1$. Since $\sqsubseteq$ is a wqo over $\Sigma$, by lemma 2.5, there is some infinite subsequence
$f' = (f_{\varphi(i)})_{i \geq 1}$ of $f$ such that $f_{\varphi(i)} \sqsubseteq f_{\varphi(i+1)}$ for all $i \geq 1$. Let

$\mathcal{D} = \{s_{\varphi(i)}/j \mid i \geq 1,\ 1 \leq j \leq rank(s_{\varphi(i)})\}$.


We claim that $\preceq$ is a wqo on $\mathcal{D}$. Otherwise, let $r = (r_1, r_2, \ldots, r_j, \ldots)$ be a bad sequence
in $\mathcal{D}$. Because $r$ is bad, it contains a bad subsequence $r' = (r'_1, r'_2, \ldots, r'_j, \ldots)$ with the
following property: if $i < j$, then $r'_i$ is a subtree of a tree $t_p$ and $r'_j$ is a subtree of a tree
$t_q$ such that $p \leq q$. Indeed, every $t_i$ only has finitely many subtrees, and $r$ being bad must
contain an infinite number of distinct trees. Thus, we consider a bad sequence $r$ with the
additional property that if $i < j$, then $r_i$ is a subtree of a tree $t_p$ and $r_j$ is a subtree of a
tree $t_q$ such that $p \leq q$. Let $n$ be the index of the first tree in the sequence $t$ such that
$t_n/j = r_1$ for some $j$. If $n = 1$, since $|r_1| < |t_1|$ and the sequence $r$ is bad, this contradicts
the fact that $t$ is a minimal bad sequence. If $n > 1$, then the sequence

$(t_1, t_2, \ldots, t_{n-1}, r_1, r_2, \ldots, r_j, \ldots)$

is bad, since by clause (2) of the definition of $\preceq$, for any $k$ s.t. $1 \leq k \leq n-1$, $t_k \preceq r_j$
would imply that $t_k \preceq t_h$ for some $t_h$ and $l$ such that $r_j = t_h/l$ and $k < h$, since each $r_i$ is a
subtree of some $t_p$ such that $n - 1 < p$. But since $|r_1| < |t_n|$, this contradicts the fact that
$t$ is a minimal bad sequence. Hence, $\preceq$ is a wqo on $\mathcal{D}$.

By Higman’s theorem (theorem 3.2), the string embedding relation $\ll$ extending the
preorder $\preceq$ on $\mathcal{D}$ is a wqo on $\mathcal{D}^*$. Hence, considering the infinite sequence over $\mathcal{D}^*$

$((s_{\varphi(1)}/1, s_{\varphi(1)}/2, \ldots, s_{\varphi(1)}/rank(s_{\varphi(1)})), \ldots, (s_{\varphi(j)}/1, s_{\varphi(j)}/2, \ldots, s_{\varphi(j)}/rank(s_{\varphi(j)})), \ldots)$,

there exist some $i, j > 0$ with $i < j$ such that, letting $m = rank(s_{\varphi(i)})$ and $n = rank(s_{\varphi(j)})$,

$(s_{\varphi(i)}/1, \ldots, s_{\varphi(i)}/m) \ll (s_{\varphi(j)}/1, \ldots, s_{\varphi(j)}/n)$,

that is, there are some positive integers $j_1 < j_2 < \cdots < j_{m-1} < j_m \leq n$ such that

$s_{\varphi(i)}/1 \preceq s_{\varphi(j)}/j_1, \ldots, s_{\varphi(i)}/m \preceq s_{\varphi(j)}/j_m$.

Since we also have $f_{\varphi(i)} \sqsubseteq f_{\varphi(j)}$, by clause (3) of the definition of $\preceq$, we have $s_{\varphi(i)} \preceq s_{\varphi(j)}$.
But $s$ is a subsequence of $t$, and this contradicts the fact that $t$ is bad. Hence, $\preceq$ is a wqo
on $T_\Sigma$. □

The above proof is basically due to Nash-Williams.

4.2  Kruskal’s  Theorem,  Version  2 

Another version of Kruskal’s theorem that assumes a given preorder on $T_\Sigma$ (and not just
$\Sigma$) can also be proved. This version (found in J.J. Levy’s unpublished notes [33]) can be
used to show that certain orderings on trees are well-founded.




Definition 4.6 Assume that $\sqsubseteq$ is a preorder on $T_\Sigma$. The preorder $\preceq$ on $T_\Sigma$ is defined
inductively as follows: Either

(1) $f \preceq g(t_1, \ldots, t_n)$ iff $f \sqsubseteq g(t_1, \ldots, t_n)$; or

(2) $s \preceq g(\ldots, t, \ldots)$ iff $s \preceq t$; or

(3) $s = f(s_1, \ldots, s_m) \preceq g(t_1, \ldots, t_n) = t$ iff $s \sqsubseteq t$, and there exist some integers $j_1, \ldots, j_m$
such that $1 \leq j_1 < j_2 < \cdots < j_{m-1} < j_m \leq n$, $1 \leq m \leq n$, and

$s_1 \preceq t_{j_1}, \ldots, s_m \preceq t_{j_m}$.

It is easy to show that $\preceq$ is a preorder. It can also be shown that $\preceq$ is a partial order
if $\sqsubseteq$ is a partial order. Again, (1) can be viewed as the special case of (3) for which $m = 0$,
and $n = 0$ is possible. It is also easy to see that $\preceq$ can be defined as the least preorder
satisfying the following properties:

(1) $s \preceq f(\ldots, s, \ldots)$;

(2) $s = f(s_1, \ldots, s_m) \preceq g(t_1, \ldots, t_n) = t$ whenever $s \sqsubseteq t$ and there exist some integers
$j_1, \ldots, j_m$ such that $1 \leq j_1 < j_2 < \cdots < j_{m-1} < j_m \leq n$, $1 \leq m \leq n$, and

$s_1 \preceq t_{j_1}, \ldots, s_m \preceq t_{j_m}$.

We  can  now  prove  another  version  of  Kruskal’s  theorem. 

Theorem 4.7 (J.J. Levy) If $\sqsubseteq$ is a wqo on $T_\Sigma$, then $\preceq$ is a wqo on $T_\Sigma$.

Proof. Assume that $\preceq$ is not a wqo on $T_\Sigma$. As in the proof of theorem 4.5, we find a
minimal bad sequence $t$ of elements of $T_\Sigma$.

Since $\sqsubseteq$ is a wqo, there is some infinite subsequence $t' = (t_{\psi(i)})_{i \geq 1}$ of $t$ such that
$t_{\psi(i)} \sqsubseteq t_{\psi(i+1)}$ for all $i \geq 1$. We claim that $|t_{\psi(i)}| \geq 2$ for all but finitely many $i \geq 1$.
Otherwise, the sequence of one-node trees in $t'$ must be infinite, and since $\sqsubseteq$ is a wqo, by
clause (1) of the definition of $\preceq$, there are $i, j > 0$ such that $\psi(i) < \psi(j)$ and $t_{\psi(i)} \preceq t_{\psi(j)}$,
contradicting the fact that $t$ is bad.

Let $s = (s_i)_{i \geq 1}$ be the infinite subsequence of $t'$ consisting of all trees having at least
two nodes. Since $s$ is a subsequence of $t'$ and $t'$ is a subsequence of $t$, $s$ is a subsequence of
$t$ of the form $s = (t_{\varphi(i)})_{i \geq 1}$ for some strictly monotonic function $\varphi$. Let

$\mathcal{D} = \{t_{\varphi(i)}/j \mid i \geq 1,\ 1 \leq j \leq rank(t_{\varphi(i)})\}$.

As in the proof of theorem 4.5, we can show that $\preceq$ is a wqo on $\mathcal{D}$.
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WHAT’S  SO  SPECIAL  ABOUT  KRUSKAL’S  THEOREM? 


By  Higman’s  theorem  (theorem  3.2),  the  string  embedding  relation  -C  extending  the 
preorder  on  2?  is  a  wqo  on  T>* .  Hence,  considering  the  infinite  sequence  over  P* 

'^¥’(i)/2,  •  •  • ,  I .  •  • ,  (t,^( j)/l,  iip(j) /2, . . . ,  tifi(j) 
there  exist  some  i,j  >  0  such  that,  letting  m  =  rank(t^^(^i))  and  n  =  rank{t^(^j)), 

•  •  •  ,  ’  ^v(j)  /^)  ’ 

that  is,  there  are  some  positive  integers  ji  <  j2  <  ■  ■  ■  <  jm-i  <  jm  <  n  such  that 

•••1  lip(i)  /  ^  I  I^{i)  /  J  m  ■ 

Since  we  also  have  C  (because  s  =  (t<^(i))j>i  is  also  a  subsequence  of  t'  =  ),>] 

and  t^(i)  E  for  all  i  >  1),  by  clause  (3)  of  the  definition  of  we  have  t^{t)  ^<p(j)- 

But  this  contradicts  the  fact  that  t  is  bad.  Hence,  is  a  wqo  on  T^.  □ 

This second version of Kruskal’s theorem (theorem 4.7) actually implies the first version (theorem 4.5). Indeed, if $\sqsubseteq$ is a preorder on $\Sigma$, we can extend it to a preorder on $T_\Sigma$ by requiring that $s \sqsubseteq t$ iff $root(s) \sqsubseteq root(t)$. It is easy to check that with this definition of $\sqsubseteq$, definition 4.6 reduces to 4.4, and that theorem 4.7 is indeed theorem 4.5.

Kruskal’s theorem has been generalized in a number of ways. Among these generalizations, we mention some versions using unavoidable sets of trees due to Puel [43, 44], and a version using well rewrite orderings due to Lescanne [30].

4.3  WQO’s  and  Well-Founded  Preorders 

This  second  version  of  Kruskal’s  theorem  also  has  the  following  applications.  Recall  that 
from  lemma  2.4  a  wqo  is  well-founded.  The  following  proposition  is  very  useful  to  prove 
that  orderings  on  trees  are  well-founded. 

Proposition 4.8 Let $\ll$ be a preorder on $T_\Sigma$ and let $\le$ be another preorder on $T_\Sigma$ such that:

(1) If $f \ll g(t_1, \ldots, t_n)$, then $f \le g(t_1, \ldots, t_n)$;

(2) $s \le f(\ldots, s, \ldots)$;

(3) If $f(s_1, \ldots, s_m) \ll g(t_1, \ldots, t_n)$, and $s_1 \le t_{j_1}, \ldots, s_m \le t_{j_m}$ for some $j_1, \ldots, j_m$ such that $1 \le j_1 < \cdots < j_m \le n$, then $f(s_1, \ldots, s_m) \le g(t_1, \ldots, t_n)$.

If $\ll$ is a wqo, then $\le$ is a wqo.
Proof. Let $\preceq$ be the preorder associated with $\ll$ as in definition 4.6. Then, an easy induction shows that the conditions of the proposition imply that $\preceq\ \subseteq\ \le$. By theorem 4.7, since $\ll$ is a wqo, $\preceq$ is also a wqo, which implies that $\le$ is a wqo. By lemma 2.4, $\le$ is well-founded. □

The  following  proposition  also  gives  a  sufficient  condition  for  a  preorder  on  trees  to 
be  well-founded. 

Proposition 4.9 Assume $\Sigma$ is finite, and let $\le$ be a preorder on $T_\Sigma$ satisfying the following conditions:

(1) $s \le f(\ldots, s, \ldots)$;

(2) $s \le t$ implies that $f(\ldots, s, \ldots) \le f(\ldots, t, \ldots)$;

(3) $f(\ldots) \le f(\ldots, s, \ldots)$.

Then, $\le$ is well-founded.

Proof. Let $\ll$ be the preorder on $T_\Sigma$ defined such that $s \ll t$ iff $root(s) = root(t)$. Since $\Sigma$ is finite, $\ll$ is a wqo. Since it is clear that $\ll$ and $\le$ satisfy the conditions of proposition 4.8, $\le$ is well-founded. □

Proposition  4.8  can  be  used  to  show  that  certain  orderings  on  trees  are  well-founded. 
These  orderings  play  a  crucial  role  in  proving  the  termination  of  systems  of  rewrite  rules 
and  the  correctness  of  Knuth-Bendix  completion  procedures.  An  introduction  to  the  theory 
of  these  orderings  will  be  presented  in  section  11,  and  for  more  details,  the  reader  is  referred 
to  the  comprehensive  survey  by  Dershowitz  [7]  and  to  Dershowitz’s  fundamental  paper  [8]. 

It  is  natural  to  ask  whether  there  is  an  analogue  to  Kruskal’s  theorem  with  respect  to 
well-founded  preorders  instead  of  wqo.  Indeed,  it  is  possible  to  prove  such  a  theorem,  using 
Kruskal’s  theorem. 

Theorem 4.10 If $\sqsubseteq$ is a well-founded preorder on $T_\Sigma$, then $\preceq$ is well-founded on $T_\Sigma$.

Proof. The proof is implicit in Levy [33], Dershowitz [8], and Lescanne [29]. Unfortunately, one cannot directly apply theorem 4.7, since $\sqsubseteq$ is not necessarily a wqo. However, there is a way around this problem. We use the fact that every well-founded preorder $\sqsubseteq$ can be extended to a total well-founded preorder $\le$. This fact can be proved rather simply using Zorn’s lemma. The point is that $\le$, being total and well-founded, is also a wqo. Now, we can apply theorem 4.7 since $\le$ is a wqo on $T_\Sigma$, and so the embedding preorder $\preceq_{\le}$ induced by $\le$ is a wqo on $T_\Sigma$, and thus it is well-founded. Finally, we note that $\preceq_{\le}$ contains $\preceq$, which proves that $\preceq$ is well-founded. □

Exercise: Find a proof of theorem 4.10 that uses neither Zorn’s lemma nor Kruskal’s theorem.

4.4  Kruskal’s  Theorem,  A  Special  Version 

Kruskal’s tree theorem is a very powerful theorem, and we now state some further interesting consequences. We consider the case where $\Sigma$ is a finite set of symbols.

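Definition 4.11 The preorder $\preceq$ on $T_\Sigma$ is defined inductively as follows: Either

(1) $f \preceq f(t_1, \ldots, t_n)$, for every $f \in \Sigma$; or

(2) $s \preceq f(\ldots, t, \ldots)$ iff $s \preceq t$; or

(3) $f(s_1, \ldots, s_m) \preceq f(t_1, \ldots, t_n)$ iff $1 \le m \le n$, and there exist some integers $j_1, \ldots, j_m$ such that $1 \le j_1 < j_2 < \cdots < j_{m-1} < j_m \le n$ and

$s_1 \preceq t_{j_1}, \ldots, s_m \preceq t_{j_m}$.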

Again,  (1)  can  be  viewed  as  the  special  case  of  (3)  in  which  m  =  0.  For  example, 

$f(f(h, h), h(a, b)) \preceq h(f(g(f(h(b), a, h(b))), g(a), h(h(a, b, c))))$.
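The inductive clauses of definition 4.11 translate directly into a recursive test. The following is only a minimal sketch, assuming trees are encoded as (label, children) pairs; the helper names `T` and `embeds` are illustrative and do not come from the text. Clause (3) is checked with a greedy left-to-right scan, which is enough to decide whether the subtrees of s embed, in order, into a subsequence of the subtrees of t.

```python
# Trees are (label, children) pairs, where children is a tuple of trees.
# This encoding and the names T and embeds are assumptions made for illustration.
def T(label, *children):
    return (label, tuple(children))


def embeds(s, t):
    """Return True iff s embeds into t according to definition 4.11."""
    (f, ss), (g, ts) = s, t
    # Clause (2): s embeds into some immediate subtree of t.
    if any(embeds(s, u) for u in ts):
        return True
    # Clauses (1) and (3): the roots are equal and the subtrees of s embed,
    # in order, into a subsequence of the subtrees of t (m = 0 is clause (1)).
    if f != g or len(ss) > len(ts):
        return False
    i = 0
    for u in ts:
        # Greedy scan: match the next unmatched subtree of s as early as possible.
        if i < len(ss) and embeds(ss[i], u):
            i += 1
    return i == len(ss)


# The example following definition 4.11.
a, b, c, h0 = T("a"), T("b"), T("c"), T("h")
s = T("f", T("f", h0, h0), T("h", a, b))
t = T("h", T("f", T("g", T("f", T("h", b), a, T("h", b))),
              T("g", a),
              T("h", T("h", a, b, c))))
print(embeds(s, t), embeds(t, s))
```

The recursion terminates because every recursive call is made on a strictly smaller second argument or on strictly smaller subtrees of both arguments.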

It is also easy to show that the preorder $\preceq$ can be defined as the least preorder satisfying the following properties:

(1) $s \preceq f(\ldots, s, \ldots)$;

(2) $f(\ldots) \preceq f(\ldots, s, \ldots)$;

(3) $f(\ldots, s, \ldots) \preceq f(\ldots, t, \ldots)$ whenever $s \preceq t$.

Kruskal’s  theorem  implies  the  following  result. 

Theorem 4.12 Given a finite alphabet $\Sigma$, $\preceq$ is a wqo on $T_\Sigma$.

Proof. Since any preorder on a finite set is a wqo, the identity relation on $\Sigma$ is a wqo. But then, it is trivial to verify that the preorder of definition 4.11 is obtained by specializing $\sqsubseteq$ to the identity relation in definition 4.4. Hence, the theorem is a direct consequence of theorem 4.5. □

In particular, when $\Sigma$ consists of a single symbol, we have the well-known version of Kruskal’s theorem on unlabeled trees [28], except that in Kruskal’s paper, the notion of embedding is defined as a certain kind of function between tree domains. We find it more convenient to define the preorder $\preceq$ inductively, as in definition 4.4. For the sake of completeness, we present the alternate definition used by Simpson [47].
First, given a partial order $\le$ on a set $A$, given any nonempty subset $S$ of $A$, we say that $\le$ is a total order on $S$ iff for all $x, y \in S$, either $x \le y$, or $y \le x$. We also say that $S$ is a chain (under $\le$).

Definition 4.13 A finite tree domain is a nonempty set $D$ together with a partial order $\le$ satisfying the following properties:

(1) $D$ has a least element $\bot$ (with respect to $\le$).

(2) For every $x \in D$, the set $anc(x) = \{ y \in D \mid y \le x \}$ of ancestors of $x$ is a chain under $\le$.

Clearly, $\bot$ corresponds to the root of the tree, and for every $x \in D$, the set $anc(x) = \{ y \in D \mid y \le x \}$ is the set of nodes in the unique path from the root to $x$. The main difference between definition 4.1 and definition 4.13 is that independent nodes of a tree domain as defined in definition 4.13 are unordered, and, in particular, the immediate successors of a node are unordered.

Given any two elements $x, y \in D$, the greatest element of the set $anc(x) \cap anc(y)$ is the greatest lower bound of $x$ and $y$, and it is denoted as $x \wedge y$. It is the “lowest” common ancestor of $x$ and $y$. A (labeled) tree is defined as in definition 4.2, but using definition 4.13 for that of a tree domain. The notion of an embedding (or homeomorphic embedding) is then defined as follows. Let $\Sigma$ be a set with some preorder $\sqsubseteq$.

Definition 4.14 Given any two trees $t_1$ and $t_2$ with tree domains $(D_1, \le_1)$ and $(D_2, \le_2)$, an embedding $h$ from $t_1$ to $t_2$ is an injective function $h : (D_1, \le_1) \to (D_2, \le_2)$ such that:

(1) $h(x \wedge y) = h(x) \wedge h(y)$, for all $x, y \in D_1$.

(2) $t_1(x) \sqsubseteq t_2(h(x))$, for every $x \in D_1$.

It is easily shown that $h$ is monotonic (choose $x, y$ such that $x \le_1 y$, so that $x \wedge y = x$ and hence $h(x) = h(x) \wedge h(y)$). One can verify that when the immediate successors of a node are ordered, definition 4.4 is equivalent to definition 4.14.
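The two conditions of definition 4.14 can be checked mechanically for a candidate map. The sketch below makes several assumptions that are not in the text: nodes are encoded as tuples of integers (paths from the root), so that the order is the prefix order and the meet is the longest common prefix; a tree is a dictionary from nodes to labels whose key set is prefix-closed; and the label preorder defaults to equality. The names `meet` and `is_embedding` are illustrative.

```python
def meet(x, y):
    """Greatest lower bound of two nodes: their longest common prefix."""
    k = 0
    while k < min(len(x), len(y)) and x[k] == y[k]:
        k += 1
    return x[:k]


def is_embedding(t1, t2, h, leq=lambda p, q: p == q):
    """Check the conditions of definition 4.14 for a map h from t1 to t2.

    t1 and t2 map nodes (tuples of ints, prefix-closed) to labels, h maps
    nodes of t1 to nodes of t2, and leq is the assumed preorder on labels.
    """
    dom1 = list(t1)
    # h must be a total, injective map from the nodes of t1 into those of t2.
    if set(h) != set(dom1) or any(h[x] not in t2 for x in dom1):
        return False
    if len(set(h.values())) != len(dom1):
        return False
    # Condition (1): h preserves greatest lower bounds.
    if any(h[meet(x, y)] != meet(h[x], h[y]) for x in dom1 for y in dom1):
        return False
    # Condition (2): labels are related by the preorder.
    return all(leq(t1[x], t2[h[x]]) for x in dom1)


# f(a, b) embeds into f(g(a), b) by skipping the node labeled g.
t1 = {(): "f", (0,): "a", (1,): "b"}
t2 = {(): "f", (0,): "g", (0, 0): "a", (1,): "b"}
h = {(): (), (0,): (0, 0), (1,): (1,)}
print(is_embedding(t1, t2, h))
```

Condition (1) is what rules out mapping two siblings onto a single path: their meet is the parent, but the meet of their images would be the upper of the two images.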

Next,  we  shall  consider  an  extremely  interesting  version  of  Kruskal’s  theorem  due  to 
Harvey  Friedman.  A  complete  presentation  of  this  theorem  and  its  ramifications  is  given 
by  Simpson  [47]. 
5  Friedman’s  Finite  Miniaturization  of  Kruskal’s  Theorem

Friedman’s version of Kruskal’s theorem, which has been called a finite miniaturization of Kruskal’s theorem, is remarkable from a proof-theoretic point of view because it is not provable in relatively strong logical systems. Actually, Kruskal’s original theorem is also not provable in relatively strong logical systems, but Kruskal’s version is a second-order statement (a $\Pi^1_1$ statement, meaning that it is of the form $\forall X A$, where $X$ is a second-order variable ranging over infinite sequences and $A$ is first-order), whereas Friedman’s version is a first-order statement (a $\Pi^0_2$ statement, meaning that it is of the form $\forall x \exists y A$, where $A$ only contains bounded first-order quantifiers).

From now on, we assume that $\Sigma$ is a finite alphabet, and we consider the embedding preorder $\preceq$ of definition 4.11.

Theorem 5.1 (Friedman) Let $\Sigma$ be a finite set. For every integer $k \ge 1$, there exists some integer $n \ge 2$ so large that, for any finite sequence $(t_1, \ldots, t_n)$ of trees in $T_\Sigma$ with $|t_m| \le k(m+1)$ for all $m$, $1 \le m \le n$, there exist some integers $i, j$ such that $1 \le i < j \le n$ and $t_i \preceq t_j$.

Proof. Following the hint given by Simpson [47], we give a proof using theorem 4.12 and König’s lemma. Assume that the theorem fails. Let us say that a finite sequence $(t_1, \ldots, t_n)$ such that $|t_m| \le k(m+1)$ for all $m$, $1 \le m \le n$, is good iff there exist some integers $i, j$ such that $1 \le i < j \le n$ and $t_i \preceq t_j$, and otherwise, that it is bad. Then, there is some $k \ge 1$ such that for all $n \ge 1$, there is some bad sequence $(t_1, \ldots, t_n)$ (and $|t_m| \le k(m+1)$ for all $m$, $1 \le m \le n$). Observe that any initial subsequence $(t_1, \ldots, t_j)$, $j \le n$, of a bad sequence is also bad. Furthermore, the size restriction ($|t_m| \le k(m+1)$ for all $m$, $1 \le m \le n$) and the fact that $\Sigma$ is finite imply that there are only finitely many bad sequences of length $n$. Hence, the set of finite bad sequences can be arranged into an infinite tree $T$ as follows: the root of $T$ is the empty sequence, and every finite bad sequence $t$ is connected to the root by the unique path consisting of all the initial subsequences of $t$. From our previous remark, this infinite tree is finite-branching. By König’s lemma, this tree contains an infinite path $s$. But since all finite initial subsequences of $s$ are bad, $s$ itself is bad, and this contradicts theorem 4.12. □
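The tree of finite bad sequences used in this proof can be explored exhaustively for very small parameters. The following is a rough sketch under stated assumptions, not an efficient procedure: trees are encoded as (label, children) pairs, `embeds` re-implements the preorder of definition 4.11, and the hypothetical helper `longest_bad` performs a depth-first search of the bad sequences satisfying the size bound $|t_m| \le k(m+1)$. The search is finite by theorem 5.1, but it is only feasible for the smallest cases.

```python
from functools import lru_cache


def embeds(s, t):
    """Embedding preorder of definition 4.11 on (label, children) trees."""
    (f, ss), (g, ts) = s, t
    if any(embeds(s, u) for u in ts):
        return True
    if f != g:
        return False
    i = 0
    for u in ts:
        if i < len(ss) and embeds(ss[i], u):
            i += 1
    return i == len(ss)


@lru_cache(maxsize=None)
def forests(n, labels):
    """All tuples of trees over labels having n nodes in total."""
    if n == 0:
        return ((),)
    return tuple((t,) + rest
                 for first in range(1, n + 1)
                 for t in trees(first, labels)
                 for rest in forests(n - first, labels))


@lru_cache(maxsize=None)
def trees(n, labels):
    """All ordered trees over labels with exactly n nodes (n >= 1)."""
    return tuple((a, kids) for a in labels for kids in forests(n - 1, labels))


def longest_bad(k, labels, seq=()):
    """Length of the longest bad sequence extending seq, with |t_m| <= k(m+1)."""
    m = len(seq) + 1
    best = len(seq)
    for size in range(1, k * (m + 1) + 1):
        for t in trees(size, labels):
            # t may be appended only if no earlier tree embeds into it.
            if not any(embeds(s, t) for s in seq):
                best = max(best, longest_bad(k, labels, seq + (t,)))
    return best


print(longest_bad(1, ("f",)))
```

For $k = 1$ and a one-symbol alphabet the search returns 2 (the bad sequence consists of the two-node tree followed by the one-node tree), so in that case every admissible sequence of length 3 is good. Larger values of $k$ or larger alphabets quickly become infeasible for this naive search.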

A  stronger  version  of  the  previous  theorem  also  due  to  Friedman  holds. 

Theorem 5.2 (Friedman) Let $\Sigma$ be a finite set. For every integer $k \ge 2$, there exists some integer $n \ge 2$ so large that, for any finite sequence $(t_1, \ldots, t_n)$ of trees in $T_\Sigma$ with $|t_m| \le m$ for all $m$, $1 \le m \le n$, there exist some integers $i_1, \ldots, i_k$ such that $1 \le i_1 < \cdots < i_k \le n$ and $t_{i_1} \preceq \cdots \preceq t_{i_k}$.
Proof. The proof is very similar to that of theorem 5.1, but lemma 2.5 also needs to be used at the end. □

Note that theorems 5.1 and 5.2 are both of the form $\forall k \exists n A(k, n)$, where $A(k, n)$ only contains bounded quantifiers, that is, they are $\Pi^0_2$ statements. Hence, each statement defines a function $Fr$, where $Fr(k)$ is the least integer $n$ such that $A(k, n)$ holds.

One may ask how quickly this function grows. Is it exponential, super exponential, or worse? Well, this function grows extremely fast. It grows faster than Ackermann’s function, and, even though it is recursive, it is not provably total recursive in fairly strong logical theories, including Peano’s arithmetic. We will consider briefly hierarchies of fast-growing functions in section 12. For more details, we refer the reader to Cichon and Wainer [4], Wainer [54], and to Smorynski’s articles [50, 51].

The other remarkable property of the two previous theorems is that neither is provable in fairly strong logical theories ($ATR_0$, see section 10). The technical reason is that it is possible to define a function mapping finite trees to (rather large) countable ordinals, and this function is order preserving (between the embedding relation on trees and the ordering relation on ordinals). This is true in particular for the ordinal $\Gamma_0$ (see Schütte [46], chapters 13, 14). For further details, see the articles by Simpson and Smorynski in [21]. We shall present the connection with $\Gamma_0$ in sections 9 and 10.
6  The  Countable  Ordinals

In this section, we gather some definitions and results about the countable ordinals needed to explain what $\Gamma_0$ is. This ordinal plays a central role in proof theoretic investigations of a subsystem of second-order arithmetic known as “predicative analysis”, which has been studied extensively by Feferman [13] and Schütte [46]. Schütte’s axiomatic presentation of the countable ordinals ([46], chapters 13, 14) is particularly convenient (and elegant), and we follow it. Most proofs are omitted. They can be found in Schütte [46].

6.1  A  Preview  of  $\Gamma_0$

Proof theorists use (large) ordinals in inductive proofs establishing the consistency of certain theories. In order for these proofs to be as constructive as possible, it is crucial to describe these ordinals using systems of constructive ordinal notations. One way to obtain constructive ordinal notation systems is to build up inductively larger ordinals from smaller ones using functions on the ordinals. For example, if $\mathcal{O}$ denotes the set of countable ordinals, it is possible to define two functions $+$ and $\alpha \mapsto \omega^{\alpha}$ (where $\omega$ is the least infinite ordinal) generalizing addition and exponentiation on the natural numbers. Due to a result of Cantor, for every ordinal $\alpha \in \mathcal{O}$, if $\alpha > 0$, there are unique ordinals $\alpha_1 \ge \cdots \ge \alpha_n$, $n \ge 1$, such that

$\alpha = \omega^{\alpha_1} + \cdots + \omega^{\alpha_n}$.  (*)

This suggests a constructive ordinal notation system. Define $C$ to be the smallest set of ordinals containing $0$ and closed under $+$ and $\alpha \mapsto \omega^{\alpha}$.

Do we have $C = \mathcal{O}$? The answer is no. Indeed, strange things happen with infinite ordinals. For some ordinals $\alpha, \beta$ such that $0 < \alpha < \beta$, we can have $\alpha + \beta = \beta$, and even $\omega^{\alpha} = \alpha$!

An ordinal $\beta > 0$ such that $\alpha + \beta = \beta$ for all $\alpha < \beta$ is called an additive principal ordinal. It can be shown that an ordinal is an additive principal ordinal iff it is of the form $\omega^{\eta}$ for some $\eta$.

The general phenomenon that we are witnessing is the fact that if a function $f : \mathcal{O} \to \mathcal{O}$ satisfies a certain continuity condition, then it has fixed points (an ordinal $\alpha$ is a fixed point of $f$ iff $f(\alpha) = \alpha$).

The least ordinal such that $\omega^{\alpha} = \alpha$ (the least fixed point of $\alpha \mapsto \omega^{\alpha}$) is denoted by $\epsilon_0$, and $C$ provides a constructive ordinal notation system for the ordinals $< \epsilon_0$. The main point here is that for every ordinal $\alpha < \epsilon_0$, we can guarantee that $\alpha_i < \alpha$ in the decomposition (*).
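The notation system for the ordinals below $\epsilon_0$ can be made concrete with nested tuples. This is a minimal sketch under an assumed encoding that is not from the text: an ordinal in Cantor normal form (*) is represented by the tuple of its exponents in non-increasing order, each exponent being such a tuple itself, and the names `less`, `add` and `omega_pow` are illustrative. It shows in particular the absorption phenomenon $\alpha + \beta = \beta$.

```python
# (a1, ..., an) stands for w^a1 + ... + w^an with a1 >= ... >= an;
# the empty tuple stands for 0.  This encoding is an assumption of the sketch.
ZERO = ()


def omega_pow(a):
    """The ordinal w^a."""
    return (a,)


def less(a, b):
    """Strict ordering on notations in Cantor normal form."""
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    # All compared exponents are equal: the shorter sum is the smaller one.
    return len(a) < len(b)


def add(a, b):
    """Ordinal sum a + b, returned in Cantor normal form."""
    if b == ZERO:
        return a
    # Terms of a below the leading term of b are absorbed by b.
    kept = tuple(x for x in a if not less(x, b[0]))
    return kept + b


ONE = omega_pow(ZERO)      # w^0 = 1
OMEGA = omega_pow(ONE)     # w^1 = w

print(add(ONE, OMEGA) == OMEGA)   # 1 + w compared with w
print(add(OMEGA, ONE) == OMEGA)   # w + 1 compared with w
```

With this encoding every exponent occurring in a notation is itself a strictly smaller notation, which is the property $\alpha_i < \alpha$ that holds below $\epsilon_0$.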
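Unfortunately, $\epsilon_0$ is too small for our purpose (which is to relate the embedding relation on finite trees with the ordering on $\Gamma_0$). To go beyond $\epsilon_0$, we need functions more powerful than $\alpha \mapsto \omega^{\alpha}$. Such a hierarchy $(\varphi_\alpha)_{\alpha \in \mathcal{O}}$ can be defined inductively, starting from $\alpha \mapsto \omega^{\alpha}$.

We let $\varphi_0$ be the function $\alpha \mapsto \omega^{\alpha}$, and for every $\alpha > 0$, $\varphi_\alpha : \mathcal{O} \to \mathcal{O}$ enumerates the common fixed points of the functions $\varphi_\beta$ for all $\beta < \alpha$ (the ordinals $\eta$ such that $\varphi_\beta(\eta) = \eta$ for all $\beta < \alpha$).

Then, we have a function $\varphi : \mathcal{O} \times \mathcal{O} \to \mathcal{O}$, defined such that $\varphi(\alpha, \beta) = \varphi_\alpha(\beta)$ for all $\alpha, \beta \in \mathcal{O}$. Note, $\varphi(1, 0) = \epsilon_0$!

The function $\varphi$ has lots of fixed points. We can have $\varphi(\alpha, \beta) = \beta$, in which case $\beta$ is called an $\alpha$-critical ordinal, or $\varphi(\alpha, 0) = \alpha$ (but we can’t have $\varphi(\alpha, \beta) = \alpha$ for $\beta > 0$). Ordinals such that $\varphi(\alpha, 0) = \alpha$ are called strongly critical.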
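As a small worked illustration of the first levels of this hierarchy, the display below uses standard facts about epsilon numbers rather than anything proved in this section; the notation $\epsilon_1$, $\epsilon_{\epsilon_0}$ for the later fixed points of $\alpha \mapsto \omega^{\alpha}$ is an assumption of the sketch.

```latex
\begin{align*}
\varphi(0,\alpha) &= \omega^{\alpha}, \\
\varphi(1,0) &= \epsilon_0 = \sup\{\omega,\ \omega^{\omega},\ \omega^{\omega^{\omega}},\ \ldots\}, \\
\varphi(1,1) &= \epsilon_1 \quad \text{(the next fixed point of } \alpha \mapsto \omega^{\alpha}\text{)}, \\
\varphi(2,0) &= \sup\{\epsilon_0,\ \epsilon_{\epsilon_0},\ \epsilon_{\epsilon_{\epsilon_0}},\ \ldots\}
  \quad \text{(the least fixed point of } \alpha \mapsto \varphi(1,\alpha)\text{)}.
\end{align*}
```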

It can be shown that for every additive principal ordinal $\gamma = \omega^{\eta}$, there exist unique $\alpha, \beta$ with $\alpha \le \gamma$ and $\beta < \gamma$, such that $\gamma = \varphi(\alpha, \beta)$. But we can’t guarantee that $\alpha < \gamma$, because $\varphi(\alpha, 0) = \alpha$ when $\alpha$ is a strongly critical ordinal. This is where $\Gamma_0$ comes in!

The ordinal $\Gamma_0$ is the least ordinal such that $\varphi(\alpha, 0) = \alpha$ (the least strongly critical ordinal). It can be shown that for all $\alpha, \beta < \Gamma_0$, we have $\alpha + \beta < \Gamma_0$ and $\varphi(\alpha, \beta) < \Gamma_0$, and also that for every additive principal ordinal $\gamma < \Gamma_0$, $\gamma = \varphi(\alpha, \beta)$ for unique ordinals such that both $\alpha < \gamma$ and $\beta < \gamma$. This fact together with the Cantor normal form (*) yields a constructive ordinal notation system for the ordinals $< \Gamma_0$ described in the sequel.

The reason why we were able to build the hierarchy $(\varphi_\alpha)_{\alpha \in \mathcal{O}}$ is that these functions satisfy certain conditions: they are increasing and continuous. Such functions are called normal functions. What is remarkable is that the function $\varphi(-, 0)$ is also a normal function, and so, it is possible to repeat the previous hierarchy construction, but this time, starting from $\varphi(-, 0)$. But there is no reason to stop there, and we can continue on and on ...!

We  have  what  is  called  a  Veblen  hierarchy  [53].  However,  this  is  going  way  beyond 
the  scope  of  these  notes  (transfinitely  beyond!).  The  intrigued  reader  is  referred  to  a  paper 
by  Larry  Miller  [34]. 

6.2  Axioms  for  the  Countable  Ordinals

Recall that a set $A$ is countable iff either $A = \emptyset$ or there is a surjective (onto) function $f : \mathbf{N} \to A$ with domain $\mathbf{N}$, the set of natural numbers. In particular, every finite set is countable.

Given a set $A$ and a partial order $\le$ on $A$, we say that $A$ is well-ordered by $\le$ iff every nonempty subset of $A$ has a least element.
This definition implies that a well-ordered set is totally ordered. Indeed, every subset $\{x, y\}$ of $A$ consisting of two elements has a least element, and so, either $x \le y$ or $y \le x$.

We say that a subset $S \subseteq A$ is strictly bounded iff there is some $b \in A$ such that $x < b$ for all $x \in S$ (recall that $x < y$ iff $x \le y$ and $x \ne y$). A subset $S$ of $A$ that is not strictly bounded is called unbounded. The set of countable ordinals is defined by the following axioms.

Definition 6.1 A set $\mathcal{O}$ together with a partial order $\le$ on $\mathcal{O}$ satisfies the axioms for the countable ordinals iff the following properties hold:

(1) $\mathcal{O}$ is well-ordered by $\le$.

(2) Every strictly bounded subset of $\mathcal{O}$ is countable.

(3) Every countable subset of $\mathcal{O}$ is strictly bounded.

Applying axiom (3) to the empty set (which is a subset of $\mathcal{O}$), we see that $\mathcal{O}$ is nonempty. Applying axiom (1) to $\mathcal{O}$, we see that $\mathcal{O}$ has a least element denoted by $0$. Repeating this argument, we see that $\mathcal{O}$ is infinite. However, $\mathcal{O}$ is not countable. Indeed, if $\mathcal{O}$ were countable, by axiom (3), there would be some $\alpha \in \mathcal{O}$ such that $\beta < \alpha$ for all $\beta \in \mathcal{O}$, which implies $\alpha < \alpha$, a contradiction.

It is possible to show that axioms (1)-(3) define the set of countable ordinals up to isomorphism. From now on, the elements of the set $\mathcal{O}$ will be called ordinals (strictly speaking, they should be called countable ordinals).

Given a property $P(x)$ of the set of countable ordinals, the principle of transfinite induction is the following:

•  If $P(0)$ holds, and

•  for every $\alpha \in \mathcal{O}$ such that $\alpha > 0$, $\forall \beta (\beta < \alpha \supset P(\beta))$ implies $P(\alpha)$, then

•  $P(\gamma)$ holds for all $\gamma \in \mathcal{O}$.

We  have  the  following  fundamental  metatheorem. 

Theorem  6.2  The  principle  of  transfinite  induction  is  valid  for  O. 

Proof. Assume that the principle of transfinite induction does not hold. Then, $P(0)$ holds, for every $\alpha \in \mathcal{O}$ such that $\alpha > 0$, $\forall \beta (\beta < \alpha \supset P(\beta))$ implies $P(\alpha)$, but the set $W = \{ \alpha \in \mathcal{O} \mid P(\alpha) = \mathbf{false} \}$ is nonempty. By axiom (1), this set has a least element $\gamma$. Clearly, $\gamma \ne 0$, and $P(\beta)$ must hold for all $\beta < \gamma$, since otherwise $\gamma$ would not be the least element of $W$. Hence, $\forall \beta (\beta < \gamma \supset P(\beta))$ holds, and from above, this implies that $P(\gamma)$ holds, contradicting the definition of $\gamma$. □

By axioms (1) and (3), for every ordinal $\alpha$, there is a smallest ordinal $\beta$ such that $\alpha < \beta$. Indeed, the set $\{\alpha\}$ is countable, hence by axiom (3) the set $\{ \beta \in \mathcal{O} \mid \alpha < \beta \}$ is nonempty, and by axiom (1), it has a least element. This ordinal is denoted by $\alpha'$, and is called the successor of $\alpha$. We have the following properties:

$\alpha < \alpha'$

$\alpha < \beta \Rightarrow \alpha' \le \beta$

$\alpha < \beta' \Rightarrow \alpha \le \beta$.

An ordinal $\beta$ is called a successor ordinal iff there is some $\alpha \in \mathcal{O}$ such that $\beta = \alpha'$. A limit ordinal is an ordinal that is neither $0$ nor a successor ordinal.

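Given any countable subset $M \subseteq \mathcal{O}$, by axiom (3), the set $\{ \alpha \in \mathcal{O} \mid \forall \beta \in M\, (\beta \le \alpha) \}$ is nonempty, and by axiom (1), it has a least element. This ordinal, denoted by $\bigsqcup M$, is the least upper bound of $M$, and it satisfies the following properties:

$\alpha \in M \Rightarrow \alpha \le \bigsqcup M$

$\alpha \le \beta$ for all $\alpha \in M \Rightarrow \bigsqcup M \le \beta$

$\beta < \bigsqcup M \Rightarrow \exists \alpha \in M$ such that $\beta < \alpha$.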

We  have  the  following  propositions. 

Proposition 6.3 If $M$ is a nonempty countable subset of $\mathcal{O}$ and $M$ has no maximal element, then $\bigsqcup M$ is a limit ordinal.

Proposition 6.4 For all $\alpha, \beta \in \mathcal{O}$, if $\gamma < \beta$ for all $\gamma < \alpha$, then $\alpha \le \beta$.

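Proof. The proposition is clear if $\alpha = 0$. If $\alpha$ is a successor ordinal, $\alpha = \delta'$ for some $\delta$, and since $\delta < \alpha$, by the hypothesis we have $\delta < \beta$, which implies $\alpha = \delta' \le \beta$. If $\alpha$ is a limit ordinal, we prove that $\alpha = \bigsqcup \{ \gamma \in \mathcal{O} \mid \gamma < \alpha \}$, which implies that $\alpha \le \beta$, since by the hypothesis $\beta$ is an upper bound of the set $\{ \gamma \in \mathcal{O} \mid \gamma < \alpha \}$. Let $\delta = \bigsqcup \{ \gamma \in \mathcal{O} \mid \gamma < \alpha \}$. First, it is clear that $\alpha$ is an upper bound of the set $\{ \gamma \in \mathcal{O} \mid \gamma < \alpha \}$, and so $\delta \le \alpha$. If $\delta < \alpha$, since $\alpha$ is a limit ordinal, we have $\delta' < \alpha$, contradicting the fact that $\delta$ is the least upper bound of the set $\{ \gamma \in \mathcal{O} \mid \gamma < \alpha \}$. Hence, $\delta = \alpha$. □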

Definition  6.5  The  set  N  of  finite  ordinals  is  the  smallest  subset  of  O  that  contains  0 
and  is  closed  under  the  successor  function. 

It is not difficult to show that $\mathbf{N}$ is countable and has no maximal element. The least upper bound of $\mathbf{N}$ is denoted by $\omega$.


Draft/September  30,  1993




Proposition 6.6 The ordinal $\omega$ is the least limit ordinal. For every $\alpha \in \mathcal{O}$, $\alpha < \omega$ iff $\alpha \in \mathbf{N}$.

It is easy to see that limit ordinals satisfy the following property: For every limit ordinal $\beta$,

$\alpha < \beta \Rightarrow \alpha' < \beta$.


6.3  Ordering  Functions 

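Given any ordinal $\alpha \in \mathcal{O}$, let $\mathcal{O}(\alpha)$ be the set $\{ \beta \in \mathcal{O} \mid \beta < \alpha \}$. Clearly, $\mathcal{O}(0) = \emptyset$, $\mathcal{O}(\omega) = \mathbf{N}$, and by axiom (2), each $\mathcal{O}(\alpha)$ is countable.

Definition 6.7 A subset $A \subseteq \mathcal{O}$ is an $\mathcal{O}$-segment iff for all $\alpha, \beta \in \mathcal{O}$, if $\beta \in A$ and $\alpha < \beta$, then $\alpha \in A$.

The set $\mathcal{O}$ itself is an $\mathcal{O}$-segment, and an $\mathcal{O}$-segment which is a proper subset of $\mathcal{O}$ is called a proper $\mathcal{O}$-segment. It is easy to show that $A$ is a proper $\mathcal{O}$-segment iff $A = \mathcal{O}(\alpha)$ for some $\alpha \in \mathcal{O}$.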

We  now  come  to  the  crucial  concept  of  an  ordering  function. 

Definition 6.8 Given a subset $B \subseteq \mathcal{O}$, a function $f : A \to B$ is an ordering function for $B$ iff:

(1) The domain $A$ of $f$ is an $\mathcal{O}$-segment.

(2) The function $f$ is strictly monotonic (or increasing), that is, for all $\alpha, \beta \in A$, if $\alpha < \beta$, then $f(\alpha) < f(\beta)$.

(3) The range of $f$ is $B$.

Intuitively speaking, an ordering function $f$ of a set $B$ enumerates the elements of the set $B$ in increasing order. Observe that an ordering function $f$ is bijective, since by (3), $f(A) = B$, and by (2), $f$ is injective. Note that the ordering function for the empty set is the empty function. The following fundamental propositions are shown by transfinite induction.

Proposition 6.9 If $f : A \to B$ is an ordering function, then $\alpha \le f(\alpha)$ for all $\alpha \in A$.

Proof. Clearly, $0 \le f(0)$. Given any ordinal $\alpha > 0$, for every $\beta < \alpha$, by the induction hypothesis, $\beta \le f(\beta)$. Since $f$ is strictly monotonic, $f(\beta) < f(\alpha)$. Hence, $\beta < f(\alpha)$ for all $\beta < \alpha$, and by proposition 6.4, this implies that $\alpha \le f(\alpha)$. □
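For a finite set of natural numbers (finite ordinals), the ordering function is simply the enumeration of the set in increasing order, and proposition 6.9 can be checked directly. The sketch below covers only this finite special case; the function name `ordering_function` and the list encoding are assumptions made for illustration.

```python
def ordering_function(B):
    """Enumerate a finite set B of natural numbers in increasing order.

    The result is a list f whose index set {0, ..., len(B) - 1} plays the
    role of the segment A, and f[a] is the a-th element of B.
    """
    return sorted(set(B))


f = ordering_function({3, 5, 8, 13})
# Strict monotonicity (condition (2) of definition 6.8).
print(all(f[a] < f[a + 1] for a in range(len(f) - 1)))
# Proposition 6.9: a <= f(a) for every a in the domain.
print(all(a <= f[a] for a in range(len(f))))
```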
Proposition 6.10 Every subset $B \subseteq \mathcal{O}$ has at most one ordering function $f : A \to B$.

Proof. Let $f_i : A_i \to B$, $i = 1, 2$, be two ordering functions for $B$. We show by transfinite induction that, if $\alpha \in A_1$, then $\alpha \in A_2$ and $f_1(\alpha) = f_2(\alpha)$. If $B = \emptyset$, then clearly $f_1 = f_2 : \emptyset \to \emptyset$. Otherwise, since $A_1$ and $A_2$ are $\mathcal{O}$-segments, $0 \in A_1$ and $0 \in A_2$. Since $f_2$ is surjective, there is some $\alpha \in A_2$ such that $f_2(\alpha) = f_1(0)$. By (strict) monotonicity of $f_2$, we have $f_2(0) \le f_1(0)$. Similarly, since $f_1$ is surjective, there is some $\beta \in A_1$ such that $f_1(\beta) = f_2(0)$, and by (strict) monotonicity of $f_1$, we have $f_1(0) \le f_2(0)$. Hence $f_1(0) = f_2(0)$. Now, assume $\alpha > 0$. Since $f_2$ is surjective, there is some $\beta \in A_2$ such that $f_2(\beta) = f_1(\alpha)$. If $\beta < \alpha$, since $A_1$ is an $\mathcal{O}$-segment, $\beta \in A_1$, and by the induction hypothesis, $\beta \in A_2$ and $f_1(\beta) = f_2(\beta)$. By strict monotonicity, $f_2(\beta) = f_1(\beta) < f_1(\alpha)$, a contradiction.

Hence, $\beta \ge \alpha$, and since $A_2$ is an $\mathcal{O}$-segment and $\beta \in A_2$, we have $\alpha \in A_2$. Assume $\beta > \alpha$. By strict monotonicity, $f_2(\alpha) < f_2(\beta)$. Since $f_1$ is surjective, there is some $\gamma \in A_1$ such that $f_1(\gamma) = f_2(\alpha)$. Since $f_2(\alpha) = f_1(\gamma)$, $f_2(\beta) = f_1(\alpha)$, and $f_2(\alpha) < f_2(\beta)$, we have $f_1(\gamma) < f_1(\alpha)$. By strict monotonicity, we have $\gamma < \alpha$. By the induction hypothesis, $f_1(\gamma) = f_2(\gamma)$, and since $f_1(\gamma) = f_2(\alpha)$, then $f_2(\gamma) = f_2(\alpha)$. Since $f_2$ is injective, we have $\alpha = \gamma$, a contradiction. Hence, $\alpha = \beta$ and $f_1(\alpha) = f_2(\alpha)$. Therefore, we have shown that $A_1 \subseteq A_2$ and for every $\alpha \in A_1$, $f_1(\alpha) = f_2(\alpha)$. Using a symmetric argument, we can show that $A_2 \subseteq A_1$ and for every $\alpha \in A_2$, $f_1(\alpha) = f_2(\alpha)$. Hence, $A_1 = A_2$ and $f_1 = f_2$. □

Given a set $B \subseteq \mathcal{O}$, for every $\beta \in B$, let $B(\beta) = \{ \gamma \in B \mid \gamma < \beta \}$. Sets of the form $B(\beta)$ are called proper segments of $B$. Observe that $B(\beta) = B \cap \mathcal{O}(\beta)$. Using proposition 6.10, we prove the following crucial result.

Proposition 6.11 Every subset $B \subseteq \mathcal{O}$ has a unique ordering function $f : A \to B$.

Proof. First, the following claim is shown.

Claim: If every proper segment $B(\beta)$ of a set $B \subseteq \mathcal{O}$ has an ordering function, then $B$ has an ordering function.

Proof of claim. The idea is to construct a function $g : B \to \mathcal{O}$ and to show that $g$ is strictly monotonic and that its range is an $\mathcal{O}$-segment. Then, the inverse of $g$ is an ordering function for $B$. By the hypothesis, for every $\beta \in B$, we have an ordering function $f_\beta : A_\beta \to B(\beta)$ for each proper segment $B(\beta)$ of $B$. By axiom (2) (in definition 6.1), $B(\beta)$ is countable. Since $f_\beta$ is bijective, $A_\beta$ is also countable, and therefore, it is a proper $\mathcal{O}$-segment. Hence, for every $\beta \in B$, there is a unique ordinal $\gamma$ such that $A_\beta = \mathcal{O}(\gamma)$, and we define the function $g : B \to \mathcal{O}$ such that $g(\beta) = \gamma$.

We show that $g$ is strictly monotonic. Let $\beta_1 < \beta_2$, $\beta_1, \beta_2 \in B$. Since the function $f_{\beta_2} : \mathcal{O}(g(\beta_2)) \to B(\beta_2)$ is surjective and $\beta_1 \in B(\beta_2)$ (since $\beta_1 < \beta_2$ and $\beta_2 \in B$), there is some $\alpha < g(\beta_2)$ such that $f_{\beta_2}(\alpha) = \beta_1$. Observe that the restriction of $f_{\beta_2}$ to $\mathcal{O}(\alpha)$ is an ordering function of $B(\beta_1)$. Since $f_{\beta_1} : \mathcal{O}(g(\beta_1)) \to B(\beta_1)$ is also an ordering function for $B(\beta_1)$, by proposition 6.10, $\mathcal{O}(\alpha) = \mathcal{O}(g(\beta_1))$, and therefore, $g(\beta_1) = \alpha < g(\beta_2)$.

We show that $g(B)$ is an $\mathcal{O}$-segment. We have to show that for every $\gamma \in g(B)$, if $\alpha < \gamma$, then $\alpha \in g(B)$. Let $\beta \in B$ such that $\gamma = g(\beta)$. Since $f_\beta : \mathcal{O}(g(\beta)) \to B(\beta)$ and $\alpha < g(\beta)$, $f_\beta(\alpha) = \beta_0$ for some $\beta_0 \in B(\beta)$. The restriction of $f_\beta$ to $\mathcal{O}(\alpha)$ is an ordering function of $B(\beta_0)$. Since $f_{\beta_0} : \mathcal{O}(g(\beta_0)) \to B(\beta_0)$ is also an ordering function for $B(\beta_0)$, by proposition 6.10, $\alpha = g(\beta_0)$, and therefore $\alpha \in g(B)$.

Since the function $g : B \to \mathcal{O}$ is strictly monotonic and $g(B)$ is an $\mathcal{O}$-segment, say $A$, its inverse $g^{-1} : A \to B$ is an ordering function for $B$. This proves the claim. □

Let $B \subseteq \mathcal{O}$. For every $\beta \in B$, note that every proper segment of $B(\beta)$ is of the form $B(\beta_0)$ for some $\beta_0 < \beta$. Using the previous claim, it follows by transfinite induction that every proper segment $B(\beta)$ of $B$ has an ordering function. By the claim, $B$ itself has an ordering function. By proposition 6.10, this function is unique. □

An  important  property  of  ordering  functions  is  continuity. 

Definition 6.12 A subset $B \subseteq \mathcal{O}$ is closed iff for every countable nonempty set $M$,

$$M \subseteq B \Rightarrow \bigsqcup M \in B.$$

An ordering function $f : A \to B$ is continuous iff $A$ is closed and for every nonempty countable set $M \subseteq A$,

$$f(\bigsqcup M) = \bigsqcup f(M).$$

Proposition 6.13 The ordering function $f : A \to B$ of a set $B$ is continuous iff $B$ is closed.

Proof. Let $f : A \to B$ be the ordering function of $B$. First, assume that $f$ is continuous. Since $f$ is bijective, for every nonempty countable subset $M \subseteq B$, there is some nonempty countable subset $U \subseteq A$ such that $f(U) = M$. Since $f$ is continuous, $f(\bigsqcup U) = \bigsqcup f(U) = \bigsqcup M$, and therefore $\bigsqcup M \in f(A) = B$, and $B$ is closed.

Conversely, assume that $B$ is closed. Let $U \subseteq A$ be a nonempty countable subset of $A$. Since $f$ is bijective, $f(U)$ is a nonempty countable subset of $B$. Since $B$ is closed, $\bigsqcup f(U) \in B$. Since $B = f(A)$, there is some $\alpha \in A$ such that $f(\alpha) = \bigsqcup f(U)$. Since $f(\delta) \le \bigsqcup f(U)$ for every $\delta \in U$, we have $f(\delta) \le f(\alpha)$, and by strict monotonicity of $f$, this implies that $\delta \le \alpha$. Hence $\bigsqcup U \le \alpha$. Since $A$ is an $\mathcal{O}$-segment, $\bigsqcup U \in A$. Hence, $A$ is closed. For all $\delta \in U$, $\delta \le \bigsqcup U$, and so $f(\delta) \le f(\bigsqcup U)$. Then, $f(\bigsqcup U)$ is an upper bound for $f(U)$, and so $\bigsqcup f(U) \le f(\bigsqcup U)$. Also, since $\bigsqcup U \le \alpha$, we have $f(\bigsqcup U) \le f(\alpha) = \bigsqcup f(U)$. But then, $\bigsqcup f(U) = f(\bigsqcup U)$, and $f$ is continuous. □

An ordering function that is continuous and whose domain is the entire set $\mathcal{O}$ is called a normal function. Normal functions play a crucial role in the definition of $\Gamma_0$.

Proposition 6.14 The ordering function $f : A \to B$ of a set $B$ is a normal function iff $B$ is closed and unbounded.

Proof. By axioms (2) and (3) (in definition 6.1), a subset $M$ of $\mathcal{O}$ is bounded iff it is countable. Since an ordering function $f : A \to B$ is bijective, it follows that $B$ is unbounded iff $A$ is unbounded. But $A$ is an $\mathcal{O}$-segment, and $\mathcal{O}$ is the only unbounded $\mathcal{O}$-segment (since a proper $\mathcal{O}$-segment is bounded). Hence, the ordering function $f$ has domain $\mathcal{O}$ iff $B$ is unbounded. This together with proposition 6.13 yields proposition 6.14. □

We  now  show  that  normal  functions  have  fixed  points. 

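Proposition 6.15 Let $f : \mathcal{O} \to \mathcal{O}$ be a continuous function. For every $\alpha \in \mathcal{O}$, let $f^0(\alpha) = \alpha$, and $f^{n+1}(\alpha) = f(f^n(\alpha))$ for all $n \ge 0$. If $\alpha \le f(\alpha)$ for every $\alpha \in \mathcal{O}$, then $\bigsqcup_{n \ge 0} f^n(\alpha)$ is the least fixed point of $f$ that is $\ge \alpha$, and $\bigsqcup_{n \ge 0} f^n(\alpha')$ is the least fixed point of $f$ that is $> \alpha$.

Proof. First, observe that a continuous function is monotonic, by applying the continuity condition to each set $\{\alpha, \beta\}$ with $\alpha < \beta$. Since $f$ is continuous,

$$f(\bigsqcup_{n \ge 0} f^n(\alpha)) = \bigsqcup_{n \ge 0} f(f^n(\alpha)) = \bigsqcup_{n \ge 0} f^{n+1}(\alpha) = \bigsqcup_{n \ge 1} f^n(\alpha) = \bigsqcup_{n \ge 0} f^n(\alpha),$$

since $\alpha \le f(\alpha)$. Hence, $\bigsqcup_{n \ge 0} f^n(\alpha)$ is a fixed point of $f$ that is $\ge \alpha$. Let $\beta$ be any fixed point of $f$ such that $\alpha \le \beta$. We show by induction that $f^n(\alpha) \le \beta$. For $n = 0$, this follows from the fact that $f^0(\alpha) = \alpha$ and the hypothesis $\alpha \le \beta$. If $f^n(\alpha) \le \beta$, since $f$ is monotonic we have $f(f^n(\alpha)) \le f(\beta)$, that is, $f^{n+1}(\alpha) \le \beta$, since $f^{n+1}(\alpha) = f(f^n(\alpha))$ and $f(\beta) = \beta$ (because $\beta$ is a fixed point of $f$). Hence, $\bigsqcup_{n \ge 0} f^n(\alpha) \le \beta$, which shows that $\bigsqcup_{n \ge 0} f^n(\alpha)$ is the least fixed point of $f$ that is $\ge \alpha$.

From above, $\bigsqcup_{n \ge 0} f^n(\alpha')$ is the least fixed point of $f$ that is $\ge \alpha'$, and since $\beta \ge \alpha'$ iff $\beta > \alpha$, the second part of the lemma holds. □

Corollary 6.16 For every normal function $f$, for every $\alpha \in \mathcal{O}$, $\bigsqcup_{n \ge 0} f^n(\alpha)$ is the least fixed point of $f$ that is $\ge \alpha$, and $\bigsqcup_{n \ge 0} f^n(\alpha')$ is the least fixed point of $f$ that is $> \alpha$.

Proof. Since a normal function is continuous and $\alpha \le f(\alpha)$ for all $\alpha$, the corollary follows from proposition 6.15. □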

Using the concept of a normal function, we are going to define addition and exponentiation of ordinals.

6.4  Addition  and  Exponentiation  of  Ordinals 

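For every $\alpha \in \mathcal{O}$, let $B_\alpha = \{\beta \in \mathcal{O} \mid \alpha \le \beta\}$. Let $f_\alpha$ be the ordering function of $B_\alpha$ given by proposition 6.11. It is easy to see that $B_\alpha$ is closed and unbounded. Hence, by proposition 6.14, $f_\alpha$ is a normal function. We shall write $\alpha + \beta$ for $f_\alpha(\beta)$. The following properties of $+$ can be shown:

$\alpha \le \alpha + \beta$.

$\beta < \gamma \Rightarrow \alpha + \beta < \alpha + \gamma$ (right strict monotonicity).

If $\alpha \le \beta$, then there is a unique $\gamma$ such that $\alpha + \gamma = \beta$.

For every limit ordinal $\beta \in \mathcal{O}$, $\bigsqcup \mathcal{O}(\beta) = \beta$, and $\alpha + \beta = \bigsqcup \{\alpha + \gamma \mid \gamma \in \mathcal{O}(\beta)\}$.

$\alpha + 0 = \alpha$.

$\alpha + \beta' = (\alpha + \beta)'$.

$\beta \le \alpha + \beta$.

$0 + \beta = \beta$.

$(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$.

$\alpha \le \beta \Rightarrow \alpha + \gamma \le \beta + \gamma$ (left weak monotonicity).

It should be noted that addition of ordinals is not commutative. Indeed, $0' + \omega = \bigsqcup \mathbf{N} = \omega$, but $\omega < \omega + 0'$ by right strict monotonicity.

Definition 6.17 An ordinal $\alpha \in \mathcal{O}$ is a principal additive ordinal iff $\alpha \ne 0$ and for every $\beta < \alpha$, $\beta + \alpha = \alpha$.

Clearly, $1 = 0'$ is the smallest additive principal ordinal, and it is not difficult to show that $\omega$ is the least additive principal ordinal greater than 1. Note that $\alpha + 1 = \alpha'$.

If $\alpha$ is an additive principal ordinal, then $\mathcal{O}(\alpha)$ is closed under addition.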
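Proposition 6.18 The set of additive principal ordinals is closed and unbounded.

Proof. First, we show unboundedness. Given any ordinal $\alpha$, let $\beta_0 = \alpha'$, $\beta_{n+1} = \beta_n + \beta_n$, $M = \{\beta_n \mid n \in \mathbf{N}\}$, and $\beta = \bigsqcup M$. Since $\beta_0 = \alpha' > 0$, we have $\beta_n > 0$ for all $n \ge 0$, and by right strict monotonicity of $+$, $\beta_n < \beta_n + \beta_n = \beta_{n+1}$. Hence, $\alpha < \beta_n < \beta$ for all $n \ge 0$, and $\beta > 0$. If $\eta < \beta$, then there is some $n \ge 0$ such that $\eta < \beta_n$. Hence, for all $m \ge n$, $\eta + \beta_m \le \beta_m + \beta_m = \beta_{m+1} < \beta$. Hence, $\bigsqcup \{\eta + \beta_n \mid n \in \mathbf{N}\} \le \beta$. But we also have $\beta \le \eta + \beta = \bigsqcup \{\eta + \beta_n \mid n \in \mathbf{N}\} \le \beta$. Hence, $\eta + \beta = \beta$ for all $\eta < \beta$. Therefore, $\beta$ is an additive principal ordinal.

Next, we show closure. Let $M$ be a nonempty countable set of additive principal ordinals. Since for every $\beta \in M$, $\beta > 0$, we have $\bigsqcup M > 0$. Let $\eta < \bigsqcup M$. Then, there is some $\alpha \in M$ such that $\eta < \alpha$. For every $\beta \in M$, if $\beta \ge \alpha$, then $\eta < \beta$, and since $\beta$ is additive principal, $\eta + \beta = \beta$. Hence, $\bigsqcup \{\eta + \beta \mid \beta \in M\} = \bigsqcup M$, and since $\beta \mapsto \eta + \beta$ is continuous, $\eta + \bigsqcup M = \bigsqcup M$ for all $\eta < \bigsqcup M$, which shows that $\bigsqcup M$ is additive principal. □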
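
As a worked illustration of the unboundedness construction (an added example, not part of the original argument), take $\alpha = 0$. The notation $2^n$ below is ordinary exponentiation of natural numbers and is used only as shorthand for the finite ordinals involved.

```latex
\begin{align*}
\beta_0 &= 0' = 1, \\
\beta_{n+1} &= \beta_n + \beta_n, \quad \text{so } \beta_n = 2^n \text{ for all } n \ge 0, \\
\beta &= \bigsqcup \{\, 2^n \mid n \in \mathbf{N} \,\} = \omega .
\end{align*}
```

Every $\eta < \omega$ is finite, and $\eta + \omega = \bigsqcup \{\eta + 2^n \mid n \in \mathbf{N}\} = \omega$, which agrees with the fact that $\omega$ is additive principal.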

By  proposition  6.14,  the  ordering  function  of  the  set  of  additive  principal  ordinals  is 
a  normal  function. 

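Definition 6.19 The ordering function of the set of additive principal ordinals is a normal function whose value for every ordinal $\alpha$ is denoted by $\omega^\alpha$.

The following properties hold.

$0 < \omega^\alpha$.

$\beta < \omega^\alpha \Rightarrow \beta + \omega^\alpha = \omega^\alpha$.

$\alpha < \beta \Rightarrow \omega^\alpha < \omega^\beta$.

For every additive principal ordinal $\beta$, there is some $\alpha$ such that $\beta = \omega^\alpha$.

For every limit ordinal $\beta$, $\omega^\beta = \bigsqcup \{\omega^\alpha \mid \alpha \in \mathcal{O}(\beta)\}$.

$\alpha < \beta \Rightarrow \omega^\alpha + \omega^\beta = \omega^\beta$.

$\omega^0 = 1$.

$\omega^1 = \omega$.

The following result, known as the Cantor Normal Form for the (countable) ordinals, is fundamental.

Proposition 6.20 (Cantor Normal Form) For every ordinal $\alpha \in \mathcal{O}$, if $\alpha > 0$ then there are unique ordinals $\alpha_1 \ge \ldots \ge \alpha_n$, $n \ge 1$, such that

$$\alpha = \omega^{\alpha_1} + \cdots + \omega^{\alpha_n}.$$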
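Proof. First, we show the existence of the representation. We proceed by transfinite induction. If $\alpha$ is an additive principal ordinal, then $\alpha = \omega^{\alpha_1}$ for some $\alpha_1$, since $\gamma \mapsto \omega^\gamma$ is the ordering function of the additive principal ordinals. Otherwise, there is some $\delta < \alpha$ such that $\delta + \alpha \ne \alpha$. Then, since $\alpha \le \delta + \alpha$ (by proposition 6.9), $\delta > 0$ and $\delta + \alpha > \alpha$. Since $\delta < \alpha$, there is some $\eta > 0$ such that $\alpha = \delta + \eta$. We must have $\eta < \alpha$, since otherwise, by right monotonicity, we would have $\delta + \alpha \le \delta + \eta = \alpha$, contradicting $\delta + \alpha > \alpha$. Hence, $\alpha = \delta + \eta$, with $0 < \delta, \eta < \alpha$. By the induction hypothesis, $\delta = \omega^{\alpha_1} + \cdots + \omega^{\alpha_m}$ and $\eta = \omega^{\beta_1} + \cdots + \omega^{\beta_n}$ for some ordinals such that $\alpha_1 \ge \ldots \ge \alpha_m$ and $\beta_1 \ge \ldots \ge \beta_n$. If we had $\alpha_i < \beta_1$ for all $i$, $1 \le i \le m$, then we would have $\delta + \eta = \eta$ (using the fact that for additive principal ordinals, if $\alpha < \beta$, then $\omega^\alpha + \omega^\beta = \omega^\beta$), that is, $\alpha = \eta$, contradicting the fact that $\eta < \alpha$. Hence, there is a largest $k$, $1 \le k \le m$, such that $\alpha_k \ge \beta_1$. Consequently, $\alpha_1 \ge \ldots \ge \alpha_k \ge \beta_1 \ge \ldots \ge \beta_n$, and since $\omega^{\alpha_j} + \omega^{\beta_1} = \omega^{\beta_1}$ for $k + 1 \le j \le m$, we have

$$\alpha = \delta + \eta = \omega^{\alpha_1} + \cdots + \omega^{\alpha_k} + \omega^{\alpha_{k+1}} + \cdots + \omega^{\alpha_m} + \omega^{\beta_1} + \cdots + \omega^{\beta_n} = \omega^{\alpha_1} + \cdots + \omega^{\alpha_k} + \omega^{\beta_1} + \cdots + \omega^{\beta_n}.$$

Assume $\alpha = \omega^{\alpha_1} + \cdots + \omega^{\alpha_m} = \omega^{\beta_1} + \cdots + \omega^{\beta_n}$. Uniqueness is shown by induction on $m$. Note that $\alpha + \omega^{\alpha_1'} = \omega^{\alpha_1'}$, which implies that $\alpha < \omega^{\alpha_1'}$ (by right strict monotonicity, since $\omega^{\alpha_1'} > 0$), and similarly, $\alpha < \omega^{\beta_1'}$. If we had $\beta_1' \le \alpha_1$, we would have $\omega^{\beta_1'} \le \omega^{\alpha_1} \le \alpha$, contradicting the fact that $\alpha < \omega^{\beta_1'}$. Hence, $\alpha_1 < \beta_1'$. Similarly, we have $\beta_1 < \alpha_1'$. But then, $\alpha_1 \le \beta_1$ and $\beta_1 \le \alpha_1$, and therefore, $\alpha_1 = \beta_1$. Hence, either $m = n = 1$, or $m, n > 1$ and $\omega^{\alpha_2} + \cdots + \omega^{\alpha_m} = \omega^{\beta_2} + \cdots + \omega^{\beta_n}$. We conclude using the induction hypothesis. □

As we shall see in the next section, there are ordinals such that $\omega^\alpha = \alpha$, and so, we cannot ensure that $\alpha_i < \alpha$. However, if $n > 1$, by right strict monotonicity of $+$, it is true that $\omega^{\alpha_i} < \alpha$, $1 \le i \le n$. We are now ready to define some normal functions that will lead us to the definition of $\Gamma_0$.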
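
To make the Cantor Normal Form more concrete, here is a minimal Python sketch (an added illustration, not taken from the paper) of comparison and addition for ordinals whose normal forms nest only finitely deep, that is, ordinals below the first $\alpha$ with $\omega^\alpha = \alpha$. The tuple encoding and the names `compare`, `add`, `ZERO`, `ONE`, and `OMEGA` are assumptions made for this example. Addition relies on the absorption property $\omega^\alpha + \omega^\beta = \omega^\beta$ for $\alpha < \beta$ used in the proof above.

```python
# Hypothetical encoding (not from the paper): an ordinal in Cantor normal form
# w^a1 + ... + w^an with a1 >= ... >= an is the tuple (a1, ..., an) of its
# encoded exponents; the empty tuple encodes 0.

ZERO = ()
ONE = (ZERO,)    # w^0
OMEGA = (ONE,)   # w^1


def compare(a, b):
    """Return -1, 0 or 1 according to whether a < b, a = b or a > b."""
    for x, y in zip(a, b):
        c = compare(x, y)
        if c != 0:
            return c  # the first differing exponent decides
    # One normal form is a prefix of the other: the shorter one is smaller.
    return (len(a) > len(b)) - (len(a) < len(b))


def add(a, b):
    """Return a + b, using w^x + w^y = w^y whenever x < y."""
    if not b:
        return a
    lead = b[0]
    # Terms of a whose exponent is below the leading exponent of b are absorbed.
    kept = tuple(x for x in a if compare(x, lead) >= 0)
    return kept + b


print(add(ONE, OMEGA) == OMEGA)          # 1 + w = w
print(compare(OMEGA, add(OMEGA, ONE)))   # w < w + 1
```

The two printed checks mirror the earlier remark that addition is not commutative: $1 + \omega = \omega$, while $\omega < \omega + 1$.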

6.5 α-Critical Ordinals

For each $\alpha \in \mathcal{O}$, we shall define a subset $Cr(\alpha) \subseteq \mathcal{O}$ and its ordering function inductively as follows.

Definition 6.21 For each $\alpha \in \mathcal{O}$, the set $Cr(\alpha) \subseteq \mathcal{O}$ and its ordering function $\varphi_\alpha : A_\alpha \to Cr(\alpha)$ are defined inductively as follows.

(1) $Cr(0)$ = the set of additive principal ordinals, $A_0 = \mathcal{O}$, and for every $\alpha \in \mathcal{O}$, $\varphi_0(\alpha) = \omega^\alpha$, the ordering function of $Cr(0)$.

(2) $Cr(\alpha') = \{\eta \in A_\alpha \mid \varphi_\alpha(\eta) = \eta\}$, the set of fixed points of $\varphi_\alpha$, and $\varphi_{\alpha'} : A_{\alpha'} \to Cr(\alpha')$ is the ordering function of $Cr(\alpha')$.

(3) For every limit ordinal $\beta \in \mathcal{O}$,

$$Cr(\beta) = \{\eta \in \bigcap_{\alpha < \beta} A_\alpha \mid \forall \alpha < \beta,\ \varphi_\alpha(\eta) = \eta\},$$

and $\varphi_\beta : A_\beta \to Cr(\beta)$ is the ordering function of $Cr(\beta)$.

The elements of the set $Cr(\alpha)$ are called α-critical ordinals. The following proposition shows that for $\alpha > 0$ the α-critical ordinals are the common fixed points of the normal functions $\varphi_\beta$, for all $\beta < \alpha$.

Proposition 6.22 For all $\alpha, \eta \in \mathcal{O}$, if $\alpha = 0$ then $\eta \in Cr(0)$ iff $\eta$ is additive principal, else $\eta \in Cr(\alpha)$ iff $\eta \in \bigcap_{\beta < \alpha} A_\beta$ and $\varphi_\beta(\eta) = \eta$ for all $\beta < \alpha$.

Proof. We proceed by transfinite induction. The case $\alpha = 0$ is clear since $Cr(0)$ is defined as the set of additive principal ordinals. If $\alpha$ is a successor ordinal, there is some $\beta$ such that $\alpha = \beta'$. By the induction hypothesis, $\eta \in Cr(\beta)$ iff $\eta \in \bigcap_{\gamma < \beta} A_\gamma$ and $\varphi_\gamma(\eta) = \eta$ for all $\gamma < \beta$. By the definition of $Cr(\beta')$, $\eta \in Cr(\beta') = Cr(\alpha)$ iff $\eta \in A_\beta$ and $\varphi_\beta(\eta) = \eta$. Hence, since $\alpha = \beta'$, $\eta \in Cr(\alpha)$ iff $\eta \in \bigcap_{\gamma < \alpha} A_\gamma$ and $\varphi_\gamma(\eta) = \eta$ for all $\gamma < \alpha$. If $\alpha$ is a limit ordinal, the property to be shown is clause (3) of definition 6.21. □

The  following  important  result  holds. 

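Proposition 6.23 Each set $Cr(\alpha)$ is closed and unbounded.

Proof. We show by transfinite induction that $Cr(\alpha)$ is closed and unbounded and that $A_\alpha = \mathcal{O}$.

Proof of closure. For $\alpha = 0$ this follows from the fact that the set of additive principal ordinals is closed. Assume $\alpha > 0$, and let $M \subseteq Cr(\alpha)$ be a nonempty countable subset of $Cr(\alpha)$. By the induction hypothesis, for every $\beta < \alpha$, $Cr(\beta)$ is closed and $A_\beta = \mathcal{O}$. Hence, by proposition 6.13, $\varphi_\beta$ is continuous. Hence, $\varphi_\beta(\bigsqcup M) = \bigsqcup \varphi_\beta(M) = \bigsqcup M$ for all $\beta < \alpha$. By proposition 6.22, since we also have $A_\beta = \mathcal{O}$ for all $\beta < \alpha$, this implies that $\bigsqcup M \in Cr(\alpha)$. Hence, $Cr(\alpha)$ is closed.

Proof of unboundedness. For $\alpha = 0$, this follows from the fact that the set of additive principal ordinals is unbounded and that $A_0 = \mathcal{O}$. Assume $\alpha > 0$. Given any ordinal $\beta$, let $\gamma_0 = \beta'$, $\gamma_{n+1} = \bigsqcup \{\varphi_\eta(\gamma_n) \mid \eta < \alpha\}$, $M = \{\gamma_n \mid n \in \mathbf{N}\}$, and $\gamma = \bigsqcup M$. By the induction hypothesis, for every $\delta < \alpha$, $Cr(\delta)$ is unbounded, and so $\gamma_n$ is well defined for all $n \ge 0$. We have $\beta < \gamma_0 \le \gamma$. For every $\delta < \alpha$, we have $\varphi_\delta(\gamma_n) \le \gamma_{n+1} \le \gamma$, and so $\bigsqcup \{\varphi_\delta(\gamma_n) \mid \gamma_n \in M\} \le \gamma$. By the induction hypothesis, for every $\delta < \alpha$, $Cr(\delta)$ is closed and unbounded and $A_\delta = \mathcal{O}$. Hence, $\varphi_\delta$ is continuous and

$$\varphi_\delta(\gamma) = \bigsqcup \{\varphi_\delta(\gamma_n) \mid \gamma_n \in M\}.$$

Hence, $\varphi_\delta(\gamma) \le \gamma$. By proposition 6.9, we also have $\gamma \le \varphi_\delta(\gamma)$. Hence, $\gamma = \varphi_\delta(\gamma)$ for all $\delta < \alpha$. By proposition 6.22, we have $\gamma \in Cr(\alpha)$, and $\gamma$ is an α-critical ordinal $> \beta$. Hence $Cr(\alpha)$ is unbounded, and so $A_\alpha = \mathcal{O}$. □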

Proposition  6.23  has  the  following  corollary. 

Proposition 6.24 For every $\alpha \in \mathcal{O}$, $A_\alpha = \mathcal{O}$ and $\varphi_\alpha$ is a normal function.

In view of proposition 6.24, since every function $\varphi_\alpha$ has domain $\mathcal{O}$, we can define the function $\varphi : \mathcal{O} \times \mathcal{O} \to \mathcal{O}$ such that $\varphi(\alpha, \beta) = \varphi_\alpha(\beta)$ for all $\alpha, \beta \in \mathcal{O}$. From definition 6.21 and proposition 6.24, we have the following useful properties.

Proposition 6.25 (1) $\eta \in Cr(\alpha')$ iff $\varphi(\alpha, \eta) = \eta$.

(2) For a limit ordinal $\beta$, $Cr(\beta) = \bigcap_{\alpha < \beta} Cr(\alpha)$.

Proposition 6.26 (1) If $\alpha < \beta$ then $Cr(\beta) \subseteq Cr(\alpha)$.

(2) Every ordinal $\varphi(\alpha, \beta)$ is an additive principal ordinal.

(3) $\varphi(0, \beta) = \omega^\beta$.

An ordinal $\alpha$ such that $\alpha \in Cr(\alpha)$ is particularly interesting. Actually, it is by no means obvious that such ordinals exist, but they do, and $\Gamma_0$ is the smallest. We shall consider this property in more detail.

It is interesting to see what the elements of $Cr(1)$ are. By the definition, an ordinal $\alpha$ is in $Cr(1)$ iff $\omega^\alpha = \alpha$. Such ordinals are called epsilon ordinals, because their ordering function is usually denoted by $\varepsilon$. The least element of $Cr(1)$ is $\varepsilon_0$. It can be shown that $\varepsilon_0$ is the least upper bound of the set

$$\{\omega,\ \omega^{\omega},\ \omega^{\omega^{\omega}},\ \ldots\}.$$

This is already a rather impressive ordinal. What are the elements of $Cr(2)$? Well, denoting the ordering function of $Cr(1)$ by $\varepsilon$, $\alpha \in Cr(2)$ iff $\varepsilon_\alpha = \alpha$. We claim that the smallest of these ordinals is greater than

$$\varepsilon_0,\ \varepsilon_{\varepsilon_0},\ \varepsilon_{\varepsilon_{\varepsilon_0}},\ \ldots$$

Amazingly, the ordinal $\Gamma_0$ dwarfs the ordinals just mentioned, and many more!

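The following proposition gives a rather explicit characterization of $\varphi_{\alpha'}$ in terms of fixed points. It also shows that the first element of $Cr(\alpha')$ is farther down than the first element of $Cr(\alpha)$ on the ordinal line (in fact, much farther down).

Proposition 6.27 For each $\alpha, \beta \in \mathcal{O}$, let $\varphi_\alpha^0(\beta) = \beta$, and $\varphi_\alpha^{n+1}(\beta) = \varphi_\alpha(\varphi_\alpha^n(\beta))$ for every $n \ge 0$. Then, we have

$$\varphi_{\alpha'}(0) = \bigsqcup_{n \ge 0} \varphi_\alpha^n(0),$$

$$\varphi_{\alpha'}(\beta') = \bigsqcup_{n \ge 0} \varphi_\alpha^n(\varphi_{\alpha'}(\beta) + 1),$$

$$\varphi_{\alpha'}(\beta) = \bigsqcup_{\gamma < \beta} \varphi_{\alpha'}(\gamma)$$

for a limit ordinal $\beta$. Furthermore, $\varphi_\alpha(0) < \varphi_{\alpha'}(0)$ for all $\alpha \in \mathcal{O}$.

Proof. Since $\varphi_\alpha$ is a normal function, by proposition 6.15, $\bigsqcup_{n \ge 0} \varphi_\alpha^n(0)$ is the least fixed point of $\varphi_\alpha$, and for every $\beta \in \mathcal{O}$, $\bigsqcup_{n \ge 0} \varphi_\alpha^n(\varphi_{\alpha'}(\beta) + 1)$ is the least fixed point of $\varphi_\alpha$ that is $> \varphi_{\alpha'}(\beta)$. Since $\varphi_{\alpha'}$ enumerates the fixed points of $\varphi_\alpha$, $\varphi_{\alpha'}(\beta') = \bigsqcup_{n \ge 0} \varphi_\alpha^n(\varphi_{\alpha'}(\beta) + 1)$.

Assume that $\beta$ is a limit ordinal. From the proof of proposition 6.4, we know that $\beta = \bigsqcup \{\gamma \mid \gamma < \beta\}$. Since $\varphi_{\alpha'}$ is continuous, we have

$$\varphi_{\alpha'}(\beta) = \varphi_{\alpha'}(\bigsqcup \{\gamma \mid \gamma < \beta\}) = \bigsqcup_{\gamma < \beta} \varphi_{\alpha'}(\gamma).$$

Since $0 < \varphi_\alpha(0)$, it is easily shown that $\varphi_\alpha^n(0) < \varphi_\alpha^{n+1}(0)$ for all $n \ge 0$ (using induction and the fact that $\varphi_\alpha$ is strictly monotonic), and so, $\varphi_\alpha^1(0) < \varphi_{\alpha'}(0)$. Since $\varphi_\alpha^1(0) = \varphi_\alpha(0)$, the first element of $Cr(\alpha)$, we have $\varphi_\alpha(0) < \varphi_{\alpha'}(0)$. □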
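
As an added worked instance of proposition 6.27 (not part of the original text), take $\alpha = 0$, so that $\varphi_0(\beta) = \omega^\beta$ and $\varphi_1$ enumerates the epsilon ordinals. The first two identities then unfold as follows, writing $\varepsilon_\beta$ for $\varphi_1(\beta)$.

```latex
\begin{align*}
\varepsilon_0 = \varphi_1(0)
  &= \bigsqcup_{n \ge 0} \varphi_0^n(0)
   = \bigsqcup \{\, 0,\ 1,\ \omega,\ \omega^{\omega},\ \omega^{\omega^{\omega}},\ \ldots \,\}, \\
\varepsilon_1 = \varphi_1(0')
  &= \bigsqcup_{n \ge 0} \varphi_0^n(\varepsilon_0 + 1)
   = \bigsqcup \{\, \varepsilon_0 + 1,\ \omega^{\varepsilon_0 + 1},\ \omega^{\omega^{\varepsilon_0 + 1}},\ \ldots \,\}.
\end{align*}
```

The first line uses $\varphi_0^1(0) = \omega^0 = 1$ and $\varphi_0^2(0) = \omega^1 = \omega$, so the iteration recovers the tower of $\omega$'s whose least upper bound is $\varepsilon_0$.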

Proposition 6.27 justifies the claim we made about $\varepsilon_0$, and also shows that the first element of $Cr(2)$ is the least upper bound of the set

$$\{\varepsilon_0,\ \varepsilon_{\varepsilon_0},\ \varepsilon_{\varepsilon_{\varepsilon_0}},\ \ldots\}.$$

It is hard to conceive what this limit is! Of course, things get worse when we look at the first element of $Cr(3)$, not to mention the notational difficulties involved. Can you imagine what the first element of $Cr(\varepsilon_0)$ is? Well, $\Gamma_0$ is farther away on the ordinal line!

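The following proposition characterizes the order relationship between $\varphi(\alpha_1, \beta_1)$ and $\varphi(\alpha_2, \beta_2)$.

Proposition 6.28 (i) $\varphi(\alpha_1, \beta_1) = \varphi(\alpha_2, \beta_2)$ iff either

(1) $\alpha_1 < \alpha_2$ and $\beta_1 = \varphi(\alpha_2, \beta_2)$, or

(2) $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2$, or

(3) $\alpha_2 < \alpha_1$ and $\varphi(\alpha_1, \beta_1) = \beta_2$.

(ii) $\varphi(\alpha_1, \beta_1) < \varphi(\alpha_2, \beta_2)$ iff either

(1) $\alpha_1 < \alpha_2$ and $\beta_1 < \varphi(\alpha_2, \beta_2)$, or

(2) $\alpha_1 = \alpha_2$ and $\beta_1 < \beta_2$, or

(3) $\alpha_2 < \alpha_1$ and $\varphi(\alpha_1, \beta_1) < \beta_2$.

Proof (sketch). We sketch the proof of (ii). By the definition of $\varphi$, $\varphi(\alpha_2, \beta_2) \in Cr(\alpha_2)$. If $\alpha_1 < \alpha_2$, by proposition 6.22, $\varphi(\alpha_2, \beta_2)$ is a fixed point of $\varphi_{\alpha_1}$, and so,

$$\varphi(\alpha_1, \varphi(\alpha_2, \beta_2)) = \varphi(\alpha_2, \beta_2).$$

Since $\varphi_{\alpha_1}$ is strictly monotonic, $\varphi(\alpha_1, \beta_1) < \varphi(\alpha_1, \varphi(\alpha_2, \beta_2))$ iff $\beta_1 < \varphi(\alpha_2, \beta_2)$. The case where $\alpha_2 < \alpha_1$ is similar. For $\alpha_1 = \alpha_2$, the assertion follows from the fact that $\varphi_{\alpha_1}$ is strictly monotonic. □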
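
Proposition 6.28 can be read as a recursive comparison procedure. The following Python sketch (an added illustration, not from the paper) applies it to terms built from 0 by nesting $\varphi$ only, with no sums. The encoding and the names `compare`, `ZERO`, `ONE`, `OMEGA`, and `EPS0` are assumptions made for this example. Distinct terms may denote the same ordinal, in which case `compare` returns 0.

```python
# Hypothetical encoding (not from the paper): None stands for the ordinal 0,
# and a pair (a, b) of terms stands for phi(a, b).

ZERO = None
ONE = (ZERO, ZERO)    # phi(0, 0) = w^0 = 1
OMEGA = (ZERO, ONE)   # phi(0, 1) = w^1
EPS0 = (ONE, ZERO)    # phi(1, 0), the least element of Cr(1)


def compare(s, t):
    """Return -1, 0 or 1 according to whether s < t, s = t or s > t."""
    if s is None or t is None:
        # Every phi(a, b) is additive principal, hence greater than 0.
        return (s is not None) - (t is not None)
    (a1, b1), (a2, b2) = s, t
    c = compare(a1, a2)
    if c < 0:
        return compare(b1, t)   # clause (1): compare b1 with phi(a2, b2)
    if c > 0:
        return compare(s, b2)   # clause (3): compare phi(a1, b1) with b2
    return compare(b1, b2)      # clause (2): same first argument


print(compare(OMEGA, EPS0))          # w < eps_0
print(compare((ZERO, EPS0), EPS0))   # w^(eps_0) = eps_0
```

Each recursive call is made on a pair of terms whose combined size is strictly smaller, so the procedure terminates. The second check illustrates clause (1) of part (i): $\varphi(0, \varepsilon_0)$ and $\varphi(1, 0)$ denote the same ordinal.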

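Using proposition 6.9, since each function $\varphi_\alpha$ is an ordering function, we have the following useful property.

Proposition 6.29 For all $\alpha, \beta \in \mathcal{O}$, $\beta \le \varphi(\alpha, \beta)$.

By propositions 6.28 and 6.29, we also have the following.

Corollary 6.30 For all $\alpha_1, \alpha_2, \beta_1, \beta_2 \in \mathcal{O}$, if $\alpha_1 \le \alpha_2$ and $\beta_1 \le \beta_2$, then $\varphi(\alpha_1, \beta_1) \le \varphi(\alpha_2, \beta_2)$.

The following can be shown by transfinite induction.

Proposition 6.31 (i) For every $\alpha \in \mathcal{O}$, $\alpha \le \varphi(\alpha, 0)$. Furthermore, if $\beta \in Cr(\alpha)$, then $\alpha \le \beta$.

(ii) If $\alpha \le \beta$, then $\varphi(\alpha, \beta) \le \varphi(\beta, \alpha)$.

Proof. We show $\alpha \le \varphi(\alpha, 0)$ by transfinite induction. This is clear for $\alpha = 0$. If $\alpha > 0$, for every $\beta < \alpha$, by strict monotonicity and proposition 6.22, $\varphi(\beta, 0) < \varphi(\beta, \varphi(\alpha, 0)) = \varphi(\alpha, 0)$, since $\varphi(\alpha, 0) > 0$ is a fixed point of $\varphi_\beta$. By the induction hypothesis, we have $\beta \le \varphi(\beta, 0)$, and so $\beta < \varphi(\alpha, 0)$ for all $\beta < \alpha$. By proposition 6.4, this implies that $\alpha \le \varphi(\alpha, 0)$. Now, $\beta \in Cr(\alpha)$ iff $\beta = \varphi(\alpha, \eta)$ for some $\eta$, and since $\alpha \le \varphi(\alpha, 0)$, by monotonicity, we have

$$\alpha \le \varphi(\alpha, 0) \le \varphi(\alpha, \eta) = \beta.$$

Assume $\alpha \le \beta$. Since $\beta \le \varphi(\beta, 0)$, we also have $\beta \le \varphi(\beta, \alpha)$. By proposition 6.28, since $\alpha \le \beta$ and $\beta \le \varphi(\beta, \alpha)$, we have $\varphi(\alpha, \beta) \le \varphi(\beta, \alpha)$. □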

Another  key  result  is  the  following. 

Proposition 6.32 For every additive principal ordinal $\gamma$, there exist unique $\alpha, \beta \in \mathcal{O}$ such that $\alpha \le \gamma$, $\beta < \gamma$, and $\gamma = \varphi(\alpha, \beta)$.

Proof. Recall that an additive principal ordinal is not equal to 0. By proposition 6.31, $\gamma \le \varphi(\gamma, 0)$. Since $0 < \gamma$, by strict monotonicity of $\varphi_\gamma$, $\varphi(\gamma, 0) < \varphi(\gamma, \gamma)$, and so $\gamma < \varphi(\gamma, \gamma)$. Since $\mathcal{O}$ is well-ordered, there is a least ordinal $\alpha \le \gamma$ such that $\gamma < \varphi(\alpha, \gamma)$. If $\alpha \ne 0$, the minimality of $\alpha$ implies that $\varphi(\eta, \gamma) = \gamma$ for all $\eta < \alpha$, and by proposition 6.22, $\gamma \in Cr(\alpha)$. If $\alpha = 0$, since $\gamma$ is an additive principal ordinal, by the definition of $Cr(0)$, $\gamma \in Cr(0)$. Hence, $\gamma \in Cr(\alpha)$ in both cases, and so there is some $\beta$ such that $\gamma = \varphi(\alpha, \beta)$. Since $\gamma < \varphi(\alpha, \gamma)$, by strict monotonicity of $\varphi_\alpha$, we must have $\beta < \gamma$.

It remains to prove the uniqueness of $\alpha$ and $\beta$. If $\beta_1 < \gamma$, $\beta_2 < \gamma$, and $\gamma = \varphi(\alpha_1, \beta_1) = \varphi(\alpha_2, \beta_2)$, by proposition 6.28, we must have $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2$. □

Observe that the proof does not show that $\alpha < \gamma$, and indeed, this is not necessarily true. Also, for an ordinal $\gamma$, $\gamma = \varphi(\gamma, \beta)$ holds for some $\beta$ iff $\gamma \in Cr(\gamma)$. Such ordinals exist in abundance, as we shall prove next.

Definition 6.33 An ordinal $\alpha \in \mathcal{O}$ is a strongly critical ordinal iff $\alpha \in Cr(\alpha)$.

Proposition 6.34 An ordinal $\alpha$ is strongly critical iff $\varphi(\alpha, 0) = \alpha$.

Proof. If $\alpha \in Cr(\alpha)$, there is some $\beta$ such that $\alpha = \varphi(\alpha, \beta)$. By proposition 6.31, we have $\alpha \le \varphi(\alpha, 0)$, and by strict monotonicity of $\varphi_\alpha$, we have $\beta = 0$. Conversely, it is obvious that $\varphi(\alpha, 0) = \alpha$ implies $\alpha \in Cr(\alpha)$. □

Let $\psi : \mathcal{O} \to \mathcal{O}$ be the function defined such that $\psi(\alpha) = \varphi(\alpha, 0)$ for all $\alpha \in \mathcal{O}$. We shall prove that $\psi$ is strictly monotonic and continuous. As a consequence, $\psi$ is a normal function for the set $\{\varphi(\alpha, 0) \mid \alpha \in \mathcal{O}\}$.

Proposition 6.35 The function $\psi$ (also denoted by $\varphi(-, 0)$) defined such that $\psi(\alpha) = \varphi(\alpha, 0)$ for all $\alpha \in \mathcal{O}$ is strictly monotonic and continuous.

Proof .  First,  we  prove  the  following  claim. 


Draft/Septemher  30,  1998 




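Claim: $\psi$ satisfies the following properties:

$$\psi(0) = \varphi(0, 0),$$

$$\psi(\beta') = \bigsqcup_{n \ge 0} \varphi_\beta^n(\psi(\beta)),$$

$$\psi(\beta) = \bigsqcup_{\delta < \beta} \psi(\delta)$$

for a limit ordinal $\beta$.

Proof of claim. By definition, $\psi(0) = \varphi(0, 0)$, and the second identity follows from proposition 6.15, since $\varphi_\beta^1(0) = \varphi(\beta, 0) = \psi(\beta)$, which implies that $\varphi_\beta^n(\psi(\beta)) = \varphi_\beta^{n+1}(0)$ for all $n \ge 0$. By proposition 6.22, for a limit ordinal $\beta$, $\psi(\beta) = \varphi(\beta, 0) = \eta_0$, where $\eta_0$ is the least ordinal $\eta$ such that $\varphi(\gamma, \eta) = \eta$ for all $\gamma < \beta$. For every $\gamma < \beta$, since $\varphi_\gamma$ is continuous,

$$\varphi(\gamma, \bigsqcup_{\delta < \beta} \psi(\delta)) = \bigsqcup_{\delta < \beta} \varphi(\gamma, \psi(\delta)) = \bigsqcup_{\delta < \beta} \varphi(\gamma, \varphi(\delta, 0)).$$

For $\delta > \gamma$, we have $\varphi(\gamma, \varphi(\delta, 0)) = \varphi(\delta, 0) = \psi(\delta)$, and since $\varphi$ is monotonic in both arguments,

$$\bigsqcup_{\delta < \beta} \varphi(\gamma, \varphi(\delta, 0)) = \bigsqcup_{\delta < \beta} \psi(\delta).$$

Hence,

$$\varphi(\gamma, \bigsqcup_{\delta < \beta} \psi(\delta)) = \bigsqcup_{\delta < \beta} \psi(\delta)$$

for all $\gamma < \beta$, which shows that $\eta_0 \le \bigsqcup_{\delta < \beta} \psi(\delta)$ (because $\eta_0$ is the least such common fixed point). On the other hand, $\psi(\delta) = \varphi(\delta, 0) \le \varphi(\delta, \eta_0) = \eta_0$ for all $\delta < \beta$. Hence, $\bigsqcup_{\delta < \beta} \psi(\delta) \le \eta_0$. But then, $\bigsqcup_{\delta < \beta} \psi(\delta) = \eta_0 = \psi(\beta)$. □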

We can now show that $\psi$ is continuous. Let $M$ be a nonempty countable subset of $\mathcal{O}$, and let $\beta = \bigsqcup M$. The case $\beta = 0$ is trivial. If $\beta = \alpha'$ for some $\alpha$, we must have $\beta \in M$, since otherwise $\beta$ would not be the least upper bound of $M$ (either $\gamma \le \alpha$ for all $\gamma \in M$, or $\gamma > \alpha$ for some $\gamma \in M$, a contradiction in either case). But then, $\psi(\bigsqcup M) = \psi(\beta) = \bigsqcup_{\alpha \in M} \psi(\alpha)$, since $\psi$ is monotonic. If $\beta = \bigsqcup M$ is a limit ordinal, then $\beta = \bigsqcup M = \bigsqcup \{\delta \mid \delta < \beta\}$. Hence, for every $\alpha \in M$, there is some $\delta < \beta$ such that $\alpha \le \delta$, and conversely, for every $\delta < \beta$, there is some $\alpha \in M$ such that $\delta \le \alpha$. By monotonicity of $\psi$, this implies that

$$\bigsqcup_{\alpha \in M} \psi(\alpha) = \bigsqcup_{\delta < \beta} \psi(\delta).$$

By the claim,

$$\psi(\bigsqcup M) = \psi(\beta) = \bigsqcup_{\delta < \beta} \psi(\delta),$$

and therefore,

$$\psi(\bigsqcup M) = \bigsqcup_{\alpha \in M} \psi(\alpha),$$

showing that $\psi$ is continuous.

Finally, we show that $\psi$ is strictly monotonic. Since $\varphi$ is monotonic in both arguments, $\psi = \varphi(-, 0)$ is monotonic. Assume $\alpha < \beta$. Then $\alpha < \alpha' \le \beta$ and by proposition 6.27,

$$\psi(\alpha) < \psi(\alpha') \le \psi(\beta). \quad □$$

Proposition 6.35 implies that there are plenty of strongly critical ordinals.

Proposition 6.36 The set of strongly critical ordinals is closed and unbounded.

Proof. First, we prove unboundedness. Since $\psi = \varphi(-, 0)$ is a normal function, by corollary 6.16, for any arbitrary ordinal $\alpha$, $\psi$ has a least fixed point $> \alpha$. Since such fixed points are strongly critical ordinals, the set of strongly critical ordinals is unbounded.

Next, we prove that the set of strongly critical ordinals is closed. Let $M$ be a nonempty countable set of strongly critical ordinals. For each $\alpha \in M$, we have $\varphi(\alpha, 0) = \alpha$. Hence, $\psi(M) = M$. Since $\psi = \varphi(-, 0)$ is continuous, we have $\psi(\bigsqcup M) = \bigsqcup \psi(M) = \bigsqcup M$. This shows that $\bigsqcup M$ is a strongly critical ordinal, and therefore, the set of strongly critical ordinals is closed. □

From proposition 6.36, the ordering function of the set of strongly critical ordinals is a normal function. This function is denoted by $\Gamma$, and $\Gamma(0)$, also denoted $\Gamma_0$, is the least strongly critical ordinal. $\Gamma_0$ is the least ordinal such that $\varphi(\alpha, 0) = \alpha$. The following proposition shows that $\mathcal{O}(\Gamma_0)$ is closed under $+$ and $\varphi$.

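Proposition 6.37 For all $\alpha, \beta \in \mathcal{O}$, if $\alpha, \beta < \Gamma_0$, then $\alpha + \beta < \Gamma_0$, and $\varphi(\alpha, \beta) < \Gamma_0$.

Proof (sketch). Since $\Gamma_0$ is an additive principal ordinal, closure under $+$ is clear. Let $\gamma_0 = 0$, $\gamma_{n+1} = \varphi(\gamma_n, 0)$, $U = \{\gamma_n \mid n \in \mathbf{N}\}$, and $\gamma = \bigsqcup U$. By proposition 6.15, we have $\gamma = \Gamma_0$. Now, if $\alpha, \beta < \Gamma_0$, since $\Gamma_0 = \bigsqcup U$, there is some $\gamma_n$ such that $\alpha, \beta < \gamma_n$. By proposition 6.28, we have $\varphi(\alpha, \beta) < \varphi(\gamma_n, 0)$, because $\beta < \gamma_n \le \varphi(\gamma_n, 0)$. Hence, $\varphi(\alpha, \beta) < \gamma_{n+1} \le \Gamma_0$. □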
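
For orientation, the first few terms of the sequence $\gamma_n$ used in this proof can be written out explicitly (an added illustration, using $\varphi(0, \beta) = \omega^\beta$ and the fact that $\varphi(1, 0)$ is the least element of $Cr(1)$):

```latex
\begin{align*}
\gamma_0 &= 0, \\
\gamma_1 &= \varphi(0, 0) = \omega^0 = 1, \\
\gamma_2 &= \varphi(1, 0) = \varepsilon_0, \\
\gamma_3 &= \varphi(\varepsilon_0, 0), \\
\Gamma_0 &= \bigsqcup \{\, \gamma_n \mid n \in \mathbf{N} \,\}.
\end{align*}
```

Already $\gamma_3$ is the first element of $Cr(\varepsilon_0)$, the ordinal the text describes as hard to imagine.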

Proposition 6.37 shows that $\Gamma_0$ cannot be obtained from strictly smaller ordinals in terms of the function $+$ and the powerful functions $\varphi_\alpha$, $\alpha < \Gamma_0$. As Smoryński puts it in one of his articles [50],

“$\Gamma_0$ is the first countable ordinal which cannot be described without reference (if only oblique) to the uncountable.”

Indeed, referring to $\Gamma_0$ as the least ordinal $\alpha$ satisfying $\alpha = \varphi(\alpha, 0)$ is indirect and somewhat circular: the word “least” involves reference to all ordinals, including $\Gamma_0$. One could claim that the definition of $\Gamma_0$ as $\bigsqcup \{\gamma_n \mid n \in \mathbf{N}\}$, as in proposition 6.37, is “constructive”, and does not refer to the uncountable, but this is erroneous, although the error is more subtle. Indeed, the construction of the function $\varphi(-, 0)$ is actually an iteration of the functional taking us from $\varphi(\alpha, -)$ to $\varphi(\alpha', -)$, and therefore, presupposes as domain of this functional a class of functions on ordinals and thus (on close examination) the uncountable. As logicians say, the definition of the ordinal $\Gamma_0$ is impredicative.
What have we accomplished in section 6.5? If one examines carefully the proofs of propositions 6.23, 6.24, 6.27, 6.28, 6.31, 6.35, and definition 6.21, one discovers that the conditions that make everything go through are the fact that $\alpha \mapsto \omega^\alpha$ is a normal function $\varphi$ such that $0 < \varphi(0)$. This suggests the following generalization.

Definition 7.1 Given any normal function $\varphi$ such that $0 < \varphi(0)$, mimicking definition 6.21, we define the hierarchy $\{\varphi^0_\alpha\}_{\alpha \in \mathcal{O}}$ of functions such that,

• $\varphi^0_0 = \varphi$, and for every $\alpha > 0$,

• $\varphi^0_\alpha$ enumerates the set $\{\eta \mid \varphi^0_\beta(\eta) = \eta$, for all $\beta < \alpha\}$ of common fixed points of the functions $\varphi^0_\beta$, for all $\beta < \alpha$.

We have what is called a Veblen hierarchy (a concept due to Veblen [53]), and according to our previous remark, the following properties hold.

Theorem 7.2 (Veblen Hierarchy theorem) Denoting each function $\varphi^0_\alpha$ as $\varphi^0(\alpha, -)$, each $\varphi^0(\alpha, -)$ is a normal function, and the function $\varphi^0(-, 0) : \alpha \mapsto \varphi^0(\alpha, 0)$ is also a normal function such that $0 < \varphi^0(0, 0)$.

But since $\varphi^0(-, 0)$ satisfies the conditions for building a Veblen hierarchy, we can iterate the process just described in definition 7.1. For this, following Larry Miller [34], it is convenient to define an operator $\Delta_1$ on normal functions, the 1-diagonalization operator, defined as follows.

Given a normal function $\varphi$ such that $0 < \varphi(0)$, $\Delta_1(\varphi)$ is the normal function enumerating the fixed points of $\varphi^0(-, 0)$.

Note that in a single step, $\Delta_1$ performs the $\Omega$ iterations producing the Veblen hierarchy $\{\varphi^0(\alpha, -)\}_{\alpha < \Omega}$ (where $\Omega$ denotes the first uncountable ordinal, i.e., the order type of $\mathcal{O}$). Using the operator $\Delta_1$, we can define a sequence $\{\varphi^1_\beta\}_{\beta < \Omega}$ of normal functions, and so, a sequence of Veblen hierarchies, or a doubly indexed sequence of normal functions $\{\varphi^1_\beta(\gamma, -)\}_{\beta, \gamma < \Omega}$, defined as follows:

• $\varphi^1_{\beta'} = \Delta_1(\varphi^1_\beta)$, and

• $\varphi^1_\beta$ is the normal function enumerating $\bigcap_{\alpha < \beta} range(\varphi^1_\alpha)$, for a limit ordinal $\beta$.

But $\beta \mapsto \varphi^1_\beta(0)$ (also denoted $\varphi^1(-, 0)$) is also a normal function such that $0 < \varphi^1_0(0)$. Hence, we can define an operator $\Delta_2$ enumerating the fixed points of $\beta \mapsto \varphi^1_\beta(0)$, and build a hierarchy. But we can iterate the operator $\Delta$ into the transfinite! This leads to the following definition.

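Definition 7.3 Given a normal function $\varphi$ such that $0 < \varphi(0)$, we define by simultaneous induction the $\Omega$-indexed sequence $\{\Delta_\alpha\}_{\alpha < \Omega}$ of diagonalization operators and the doubly $\Omega$-indexed sequence $\{\varphi^\alpha_\beta\}_{\alpha, \beta < \Omega}$ of normal functions as follows.

• $\Delta_0(\psi)$ enumerates the fixed points of the normal function $\psi$;

• $\Delta_{\alpha'}(\varphi) = \Delta_0(\varphi^\alpha(-, 0))$ enumerates the fixed points of $\varphi^\alpha(-, 0) : \beta \mapsto \varphi^\alpha_\beta(0)$;

• $\Delta_\alpha(\varphi)$ enumerates $\bigcap_{\gamma < \alpha} range(\Delta_\gamma(\varphi))$, for a limit ordinal $\alpha$;

• $\varphi^\alpha_{\beta'} = \Delta_\alpha(\varphi^\alpha_\beta)$;

• $\varphi^\alpha_\beta$ enumerates $\bigcap_{\gamma < \beta} range(\varphi^\alpha_\gamma)$, for a limit ordinal $\beta$.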

It is convenient to keep track of the diagonalization level (the index $\alpha$) and the number
of iterations of diagonalizations of level $\alpha$ (the index $\beta$) by using indices beyond $\Omega$. Indeed,
using the families $\{\varphi^\alpha_\beta\}_{\alpha,\beta<\Omega}$, and the representation of the ordinals in base $\Omega$, it is possible
to extend our original $\Omega$-indexed hierarchy $\{\varphi(\beta,-)\}_{\beta<\Omega}$ (dropping the superscript 0 in
$\varphi^0_\beta$) to an $\Omega^\Omega$-indexed hierarchy $\{\varphi(\delta,-)\}_{\delta<\Omega^\Omega}$. Let us first consider the simple case where
$\alpha = 1$.

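Using the fact that every ordinal $\delta < \Omega^2$ is uniquely expressed as $\delta = \Omega\beta_1 + \beta_2$ for some
ordinals $\beta_1,\beta_2 < \Omega$, we can extend the $\Omega$-indexed hierarchy $\{\varphi(\beta,-)\}_{\beta<\Omega}$ to an $\Omega^2$-indexed
hierarchy $\{\varphi(\delta,-)\}_{\delta<\Omega^2}$ as follows. For any $\beta_1,\beta_2 < \Omega$, we let

$\varphi(\Omega\beta_1 + \beta_2, -)$ = .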

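With this convention applied to the function $\omega(-) : \alpha \mapsto \omega^\alpha$ and the $\Omega^2$-indexed
sequence $\{\omega(\delta,-)\}_{\delta<\Omega^2}$, note that $\omega^1_0 = \Delta_1(\omega(-)) = \Delta_0(\omega^0(-,0))$ is denoted by $\omega(\Omega,-)$,
and $\omega(\Omega,0) = \Gamma_0$ denotes the least fixed point of $\omega^0(-,0)$. Similarly, $\omega^2_0 = \Delta_2(\omega(-)) =
\Delta_0(\omega^1(-,0))$ is denoted by $\omega(\Omega^2,-)$, and $\omega(\Omega^2,0)$ denotes the least fixed point of $\omega^1(-,0)$.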

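In general, since every ordinal $\delta < \Omega^\Omega$ is uniquely expressed as

$\delta = \Omega^{\alpha_1}\beta_1 + \cdots + \Omega^{\alpha_n}\beta_n$

for some ordinals $\alpha_n < \ldots < \alpha_1 < \Omega$ and $\beta_1,\ldots,\beta_n < \Omega$, we can regard the multiply
$\Omega$-indexed sequence as an $\Omega^\Omega$-indexed sequence $\{\varphi(\delta,-)\}_{\delta<\Omega^\Omega}$, if we put

$\varphi(\Omega^{\alpha_1}\beta_1 + \cdots + \Omega^{\alpha_n}\beta_n, -) = (\ldots)$.

Hence, a constructive ordinal notation system for the ordinals less than $\varphi(\Omega^\Omega,0)$, the least
fixed point of $\delta \mapsto \varphi(\delta,0)$ ($\delta < \Omega^\Omega$), can be obtained using the families

$\{(\cdots)\}_{\alpha_n<\ldots<\alpha_1<\Omega,\ \beta_1,\ldots,\beta_n<\Omega}$.

Draft/September 30, 1993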

It is possible to go farther using Bachmann-Isles hierarchies, but we are already quite dizzy,
and refer the reader to Larry Miller’s paper [34]. Readers interested in the topic of ordinal
notations should consult the very nice expository articles by Crossley and Bridge Kister [5],
Miller [34], and Pohlers [42], and for deeper results, Schütte [46] and Pohlers [41].

8 Normal Form For the Ordinals < $\Gamma_0$

One of the most remarkable properties of $\Gamma_0$ is that the ordinals less than $\Gamma_0$ can be
represented in terms of the functions $+$ and $\varphi$. First, we need the following lemma.

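Lemma 8.1 Given an additive principal ordinal $\gamma$, if $\gamma = \varphi(\alpha,\beta)$, with $\alpha \le \gamma$ and $\beta < \gamma$,
then $\alpha < \gamma$ iff $\gamma$ is not strongly critical.

Proof. By proposition 6.31, we have $\gamma \le \varphi(\gamma,0)$. By proposition 6.28, since $\alpha \le \gamma$ and
$\beta < \gamma \le \varphi(\gamma,0)$, we have $\gamma = \varphi(\alpha,\beta) < \varphi(\gamma,0)$ iff $\alpha < \gamma$. By proposition 6.34 and
proposition 6.31, $\gamma$ is not strongly critical iff $\gamma < \varphi(\gamma,0)$, iff $\alpha < \gamma$ from above. □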

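We can now prove the fundamental normal form representation theorem for the ordinals
less than $\Gamma_0$.

Theorem 8.2 For every ordinal $\alpha$ such that $0 < \alpha < \Gamma_0$, there exist unique ordinals
$\alpha_1,\ldots,\alpha_n,\beta_1,\ldots,\beta_n$, $n \ge 1$, with $\alpha_i,\beta_i < \varphi(\alpha_i,\beta_i) \le \alpha$, $1 \le i \le n$, such that

(1) $\alpha = \varphi(\alpha_1,\beta_1) + \cdots + \varphi(\alpha_n,\beta_n)$, and

(2) $\varphi(\alpha_1,\beta_1) \ge \ldots \ge \varphi(\alpha_n,\beta_n)$.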

Proof. Using the Cantor Normal Form for the (countable) ordinals (proposition 6.20), there
are unique ordinals $\eta_1 \ge \ldots \ge \eta_n$, $n \ge 1$, such that

$\alpha = \omega^{\eta_1} + \cdots + \omega^{\eta_n}$.

Each ordinal $\omega^{\eta_i}$ is an additive principal ordinal, and let $\gamma_i = \omega^{\eta_i}$. By proposition 6.32, for
every additive principal ordinal $\gamma_i$, there exist unique $\alpha_i,\beta_i \in \mathcal{O}$ such that $\alpha_i \le \gamma_i$, $\beta_i < \gamma_i$,
and $\gamma_i = \varphi(\alpha_i,\beta_i)$. Since for each ordinal $\gamma_i$, we have $\gamma_i \le \alpha < \Gamma_0$, and $\Gamma_0$ is the least
strongly critical ordinal, by lemma 8.1, $\alpha_i < \gamma_i$. Since $\gamma_i \le \alpha$, $\alpha_i < \gamma_i$, and $\beta_i < \gamma_i$, we
have $\alpha_i < \alpha$ and $\beta_i < \alpha$. Property (2) follows from the fact that $\eta_1 \ge \ldots \ge \eta_n$ implies that
$\gamma_1 \ge \ldots \ge \gamma_n$ (since $\gamma_i = \omega^{\eta_i}$). □

We need a few more properties of the ordinals less than $\Gamma_0$ before we establish the
connection between $\Gamma_0$ and Kruskal’s theorem.

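Lemma 8.3 For all $\alpha,\beta < \Gamma_0$, if $\alpha \le \beta$, then

$\alpha \le \beta \le \beta + \alpha < \varphi(\beta,\alpha)$,

and if $\alpha \le \beta$ and $\beta < \varphi(\alpha,\beta)$, then

$\beta + \alpha < \varphi(\alpha,\beta) \le \varphi(\beta,\alpha)$.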

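Proof. That $\alpha \le \beta \le \beta + \alpha$ is easy to show. If $\alpha = 0$, since by proposition 6.31, $\beta < \varphi(\beta,0)$,
we have $\beta + 0 = \beta < \varphi(\beta,0)$. If $0 < \alpha = \beta$, we have shown earlier that $\alpha < \varphi(\alpha,\alpha)$ (in the
proof of proposition 6.32), and since $\varphi(\alpha,\alpha)$ is an additive principal ordinal, we also have
$\alpha + \alpha < \varphi(\alpha,\alpha)$. If $0 < \alpha < \beta$, by proposition 6.29, we have $\beta \le \varphi(0,\beta)$, and by proposition
6.31, we have $\beta \le \varphi(\beta,0)$. By strict monotonicity of $\varphi_\beta$, since $\alpha > 0$, we have $\beta < \varphi(\beta,\alpha)$.
Hence, $\alpha < \beta < \varphi(\beta,\alpha)$. By proposition 6.28, $\varphi(0,\beta) < \varphi(\beta,\alpha)$, since $\beta < \varphi(\beta,\alpha)$. Hence,

$\beta + \alpha < \varphi(0,\beta) + \varphi(\beta,\alpha) = \varphi(\beta,\alpha)$,

since $\varphi(0,\beta) < \varphi(\beta,\alpha)$ and $\varphi(\beta,\alpha)$ is an additive principal ordinal.

Now assume $\alpha \le \beta$ and $\beta < \varphi(\alpha,\beta)$. If $\alpha = 0$, since by proposition 6.29, $\beta < \varphi(0,\beta)$,
we have $\beta + 0 = \beta < \varphi(0,\beta)$. If $0 < \alpha = \beta$, the proof is identical to the proof of the previous
case. If $0 < \alpha < \beta$, then by proposition 6.28, $\varphi(0,\beta) < \varphi(\alpha,\beta)$, since $\beta < \varphi(\alpha,\beta)$. We can
also show that $\alpha < \varphi(\alpha,\beta)$ as in the previous case (since $\beta > 0$), and we have

$\beta + \alpha < \varphi(0,\beta) + \varphi(\alpha,\beta) = \varphi(\alpha,\beta)$,

since $\varphi(0,\beta) < \varphi(\alpha,\beta)$ and $\varphi(\alpha,\beta)$ is an additive principal ordinal. The fact that $\varphi(\alpha,\beta) \le
\varphi(\beta,\alpha)$ if $\alpha \le \beta$ was shown in proposition 6.31. □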

It should be noted that if $\alpha < \beta$, when $\beta = \varphi(\alpha,\beta)$ (which happens when $\beta \in Cr(\alpha')$),
the inequality $\beta + \alpha < \varphi(\alpha,\beta)$ is incorrect. This minor point noted at the very end of
Simpson’s paper [47, page 117] is overlooked in one of Smoryński’s papers [51, page 394].
In the next section, we will correct Smoryński’s defective proof (Simpson’s proof is also
defective, but he gives a glimpse of a “repair” at the very end of his paper, page 117).

By theorem 8.2, the ordinals less than $\Gamma_0$ can be defined recursively as follows.

Lemma 8.4 For every ordinal $\gamma < \Gamma_0$, either

(1) $\gamma = 0$, or

(2) $\gamma = \beta + \alpha$, for some ordinals $\alpha,\beta < \gamma$ such that $\alpha \le \beta$, or

(3) $\gamma = \varphi(\alpha,\beta)$, for some ordinals $\alpha,\beta < \gamma$.

Proof. The proof follows immediately from theorem 8.2 by induction on n in the
decomposition $\gamma = \varphi(\alpha_1,\beta_1) + \cdots + \varphi(\alpha_n,\beta_n)$. □

In case (3), we cannot guarantee that $\alpha \le \beta$, and we have to consider the three
subcases $\alpha < \beta$, $\alpha = \beta$, and $\alpha > \beta$. Actually, we can reduce these three cases to two if we
replace $<$ by $\le$.

This recursive representation of the ordinals $< \Gamma_0$ is the essence of the connection
between $\Gamma_0$ and Kruskal’s theorem explored in section 9.

Lemma 8.4 shows that every ordinal $\alpha < \Gamma_0$ can be represented in terms of $0$, $+$, and $\varphi$,
but this representation has some undesirable properties, namely that different notations can
represent the same ordinal. In particular, for some $\alpha < \beta < \Gamma_0$, we may have $\beta = \varphi(\alpha,\beta)$
(which happens when $\beta \in Cr(\alpha')$). For example, $\epsilon_0 = \varphi(0,\epsilon_0)$ (since $\epsilon_0 = \varphi(1,0)$). It
would be desirable to have a representation similar to that given by theorem 8.2, but for a
function $\psi$ such that $\alpha < \psi(\alpha,\beta)$ and $\beta < \psi(\alpha,\beta)$ for all $\alpha,\beta < \Gamma_0$. Such a representation
is possible, as shown in Schütte [46, Section 13.7, pages 84-92]. The key point is to consider
ordinals $\gamma$ that are maximal $\alpha$-critical, that is, maximal with respect to the property of
belonging to some $Cr(\alpha)$.

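Definition 8.5 An ordinal $\gamma \in \mathcal{O}$ is maximal $\alpha$-critical iff $\gamma \in Cr(\alpha)$ and $\gamma \notin Cr(\beta)$ for
all $\beta > \alpha$.

By proposition 6.22 and proposition 6.23, $\gamma \in Cr(\alpha)$ iff $\varphi_\beta(\gamma) = \gamma$ for all $\beta < \alpha$.
Thus, $\gamma$ is maximal $\alpha$-critical iff $\varphi_\alpha(\gamma) \ne \gamma$. However, because $\varphi_\alpha$ is the ordering function
of $Cr(\alpha)$, we know from proposition 6.9 that $\delta \le \varphi_\alpha(\delta)$ for all $\delta$, and so, $\gamma$ is maximal
$\alpha$-critical iff $\gamma = \varphi_\alpha(\beta)$ for some $\beta < \gamma$. It follows from proposition 6.32 that for every
principal additive number $\gamma$, there is some $\alpha \le \gamma$ such that $\gamma$ is maximal $\alpha$-critical.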

Definition 8.6 The function $\psi_\alpha$ is defined as the ordering function of the maximal
$\alpha$-critical ordinals.

We also define $\psi(\alpha,\beta)$ by letting $\psi(\alpha,\beta) = \psi_\alpha(\beta)$. It is possible to give a definition
of $\psi$ in terms of $\varphi$, as shown in Schütte [46].

Lemma 8.7 The function $\psi$ defined such that

$$\psi(\alpha,\beta) = \begin{cases} \varphi(\alpha,\beta+1), & \text{if } \beta = \beta_0 + n \text{ and } \varphi(\alpha,\beta_0) = \beta_0, \text{ for some } \beta_0 \text{ and } n \in \mathbf{N}; \\ \varphi(\alpha,\beta), & \text{otherwise,} \end{cases}$$

is the ordering function of the maximal $\alpha$-critical ordinals for every $\alpha$.

We list the following properties of $\psi$ without proof, referring the reader to Schütte
[46] for details.

Lemma 8.8 For every additive principal number $\gamma$, there are unique $\alpha,\beta \le \gamma$ such that
$\gamma = \psi(\alpha,\beta)$.

Lemma 8.9 (1) If $\gamma = \psi(\alpha,\beta)$, then $\alpha < \gamma$ iff $\gamma$ is not strongly critical.

(2) $\beta < \psi(\alpha,\beta)$ for all $\alpha,\beta$.


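Lemma 8.10 $\psi(\alpha_1,\beta_1) < \psi(\alpha_2,\beta_2)$ holds iff either

(1) $\alpha_1 < \alpha_2$ and $\beta_1 < \psi(\alpha_2,\beta_2)$, or

(2) $\alpha_1 = \alpha_2$ and $\beta_1 < \beta_2$, or

(3) $\alpha_2 < \alpha_1$ and $\psi(\alpha_1,\beta_1) \le \beta_2$.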

It should be noted that the set of maximal $\alpha$-critical ordinals is unbounded, but it is
not closed, because the function $\psi_\alpha$ is not continuous. However, this is not a problem for
representing the ordinals less than $\Gamma_0$.

Since $\Gamma_0$ is the least strongly critical ordinal, by lemma 8.9, we have the following
corollary.

Lemma 8.11 For all $\alpha,\beta < \Gamma_0$, we have

(1) $\alpha < \psi(\alpha,\beta)$, and

(2) $\beta < \psi(\alpha,\beta)$.

Using lemma 8.8, we can prove another version of the normal form theorem 8.2 for
the ordinals less than $\Gamma_0$, using $\psi$ instead of $\varphi$.

Theorem 8.12 For every ordinal $\alpha$ such that $0 < \alpha < \Gamma_0$, there exist unique ordinals
$\alpha_1,\ldots,\alpha_n,\beta_1,\ldots,\beta_n$, $n \ge 1$, with $\alpha_i,\beta_i < \psi(\alpha_i,\beta_i) \le \alpha$, $1 \le i \le n$, such that

(1) $\alpha = \psi(\alpha_1,\beta_1) + \cdots + \psi(\alpha_n,\beta_n)$, and

(2) $\psi(\alpha_1,\beta_1) \ge \ldots \ge \psi(\alpha_n,\beta_n)$.

The advantage of the representation given by theorem 8.12 is that it is now possible
to design a system of notations where distinct notations represent distinct ordinals, and
$\psi$ satisfies the subterm property of lemma 8.11. Such a notation system will be given in
section 11.

9 Kruskal’s Theorem and $\Gamma_0$

The connection between $\Gamma_0$ and Kruskal’s theorem lies in the fact that there is a close
relationship between the embedding relation $\preceq$ on trees (definition 4.11) and the
well-ordering $\le$ on $\mathcal{O}(\Gamma_0)$ (recall that $\mathcal{O}(\Gamma_0)$ is the set of all ordinals $< \Gamma_0$).

We shall restrict our attention to tree domains, or equivalently assume that the set of
labels contains a single symbol. Let T denote the set of all finite tree domains, which, for
brevity, are also called trees. In this case, by a previous remark, it is easy to show that $\preceq$ is
a partial order. We shall exhibit a function $h : T \to \mathcal{O}(\Gamma_0)$ from the set of finite trees to the
set of ordinals less than $\Gamma_0$, and show that h (1) is surjective, and (2) preserves order, that
is, if $s \preceq t$, then $h(s) \le h(t)$ (where $\preceq$ is the embedding relation defined in definition 4.11).
It will follow that Kruskal’s theorem (theorem 4.12) implies that $\mathcal{O}(\Gamma_0)$ is well-ordered by
$\le$, or put slightly differently, Kruskal’s theorem implies the validity of transfinite induction
on $\Gamma_0$. In turn, the provability of transfinite induction on large ordinals is known to be
proof-theoretically significant. As first shown by Gentzen, one can prove the consistency of
logical theories using transfinite induction on large ordinals. As a consequence, Kruskal’s
theorem is not provable in fairly strong logical theories, in particular some second-order
theories for which transfinite induction up to $\Gamma_0$ is not provable.

We now give the definition of the function h mentioned above. In view of the recursive
characterization of the ordinals $< \Gamma_0$, it is relatively simple to define a surjective function
from T to $\mathcal{O}(\Gamma_0)$. However, making h order-preserving is trickier. As a matter of
fact, this is why lemma 8.3 is needed, but beware! Simpson defines a function h using
five recursive cases, but points out at the end of his paper that there is a problem, due
to the failure of the inequality $\beta + \alpha < \varphi(\alpha,\beta)$ [47, page 117]. Actually, a definition with
fewer cases can be given, and Smoryński defines a function h using four recursive cases [51].
Unfortunately, Smoryński’s definition also makes use of the erroneous inequality [51, page
394]. We give what we believe to be a repaired version of Smoryński’s definition of h (using
five recursive cases).

Remark. We do not know whether a definition using the function $\psi$ of the previous
section can be given. Certainly a surjective function can be defined using $\psi$, but the difficult
part is to ensure monotonicity.

Definition 9.1 The function $h : T \to \mathcal{O}(\Gamma_0)$ from the set of finite trees to the set of
ordinals less than $\Gamma_0$ is defined recursively as follows:

(0) $h(t) = 0$, when t is the one-node tree.

(1) $h(t) = h(t/1)$, if $\mathit{rank}(t) = 1$, i.e., the root of t has only one successor.

(2) $h(t) = \beta + \alpha$, if $\mathit{rank}(t) = 2$, where $\alpha$ is the least element of $\{h(t/1), h(t/2)\}$ and $\beta$ is
the largest.

(3) $h(t) = \varphi(\alpha,\beta)$, if $\mathit{rank}(t) = 3$, where $\alpha \le \beta$ are the two largest elements of the set
$\{h(t/1), h(t/2), h(t/3)\}$, and $\beta < \varphi(\alpha,\beta)$.

(4) $h(t) = \beta + \alpha$, if $\mathit{rank}(t) = 3$, where $\alpha \le \beta$ are the two largest elements of the set
$\{h(t/1), h(t/2), h(t/3)\}$, and $\beta = \varphi(\alpha,\beta)$.

(5) $h(t) = \varphi(\beta,\alpha)$, if $\mathit{rank}(t) \ge 4$, where $\alpha \le \beta$ are the two largest elements of the set
$\{h(t/1), h(t/2), \ldots, h(t/k)\}$, with $k = \mathit{rank}(t)$.
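
The clauses of definition 9.1 can be exercised on small trees. The Python sketch below is an illustration only, resting on several assumptions that are not part of the paper: a tree is a nested tuple of its immediate subtrees (the one-node tree is `()`); an ordinal below $\Gamma_0$ is a tuple of pairs `(a, b)` standing for a non-increasing sum of terms $\varphi(a,b)$ with $b < \varphi(a,b)$; "the two largest elements" is read with multiplicity; and the test $\beta = \varphi(\alpha,\beta)$ is performed by checking whether $\beta$ is a single term $\varphi(a',b')$ with $a' > \alpha$. The three-case comparison of $\varphi$-terms used here is the standard one and is assumed to be what proposition 6.28 provides. All function names are hypothetical.

```python
from functools import cmp_to_key

ZERO = ()  # the ordinal 0: an empty sum


def compare(x, y):
    """Three-way comparison (-1, 0, 1) of two sums in normal form."""
    for s, t in zip(x, y):
        c = compare_phi(s, t)
        if c != 0:
            return c
    return (len(x) > len(y)) - (len(x) < len(y))


def compare_phi(s, t):
    """Compare phi(a1, b1) with phi(a2, b2) by the usual three cases."""
    (a1, b1), (a2, b2) = s, t
    c = compare(a1, a2)
    if c == 0:
        return compare(b1, b2)
    if c < 0:
        return compare(b1, (t,))  # a1 < a2: compare b1 with phi(a2, b2)
    return compare((s,), b2)      # a1 > a2: compare phi(a1, b1) with b2


def add(x, y):
    """Ordinal sum x + y: terms of x below the leading term of y vanish."""
    if not y:
        return x
    return tuple(s for s in x if compare_phi(s, y[0]) >= 0) + y


def phi(a, b):
    """Normal form of phi(a, b); returns b itself when b is a fixed point."""
    if len(b) == 1 and compare(b[0][0], a) > 0:
        return b
    return ((a, b),)


def h(t):
    """The function h of definition 9.1 on nested-tuple trees."""
    vals = sorted((h(c) for c in t), key=cmp_to_key(compare))
    k = len(vals)
    if k == 0:
        return ZERO                      # clause (0)
    if k == 1:
        return vals[0]                   # clause (1)
    alpha, beta = vals[-2], vals[-1]     # two largest values, alpha <= beta
    if k == 2:
        return add(beta, alpha)          # clause (2)
    if k == 3:
        v = phi(alpha, beta)
        # clause (4) when beta = phi(alpha, beta), clause (3) otherwise
        return add(beta, alpha) if v == beta else v
    return phi(beta, alpha)              # clause (5)
```

For example, with `leaf = ()` and `t3 = (leaf, leaf, leaf)`, `h(t3)` is the notation for $\varphi(0,0) = 1$, and `h((t3, leaf, leaf, leaf))` is the notation for $\varphi(1,0) = \epsilon_0$.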

The  following  important  theorem  holds. 

Theorem 9.2 The function $h : T \to \mathcal{O}(\Gamma_0)$ is surjective and monotonic, that is, for every
two finite trees s, t, if $s \preceq t$, then $h(s) \le h(t)$.

Proof (sketch). The fact that h is surjective follows directly from the recursive definition
shown in lemma 8.4. Note that clauses (1) and (4) are not needed for showing that h is a
surjection, but they are needed to ensure that h is well defined and preserves order. By
clause (0), $h(t) = 0$, for the one-node tree t. Clause (2) is used when $\gamma = \beta + \alpha$, with
$\alpha,\beta < \gamma$ and $\alpha \le \beta$. Clause (3) is used when $\gamma = \varphi(\alpha,\beta)$ with $\alpha,\beta < \gamma$ and $\alpha \le \beta$, and
clause (5) is used when $\gamma = \varphi(\beta,\alpha)$ with $\alpha,\beta < \gamma$ and $\alpha \le \beta$.

The proof that if $s \preceq t$, then $h(s) \le h(t)$ proceeds by cases, using induction on
trees, corollary 6.30, and lemma 8.3. The only delicate case arises when $\mathit{rank}(s) = 2$,
$\mathit{rank}(t) = 3$, and, assuming that $h(t/1) \ge h(t/2) \ge h(t/3)$ and $h(s/1) \ge h(s/2)$, we have
$h(t/1) = \varphi(h(t/2), h(t/1))$, $s/1 \preceq t/1$ and $s/2 \preceq t/2$. By the induction hypothesis, $h(s/1) \le
h(t/1)$ and $h(s/2) \le h(t/2)$, and since $h(s) = h(s/1) + h(s/2)$ and $h(t) = h(t/1) + h(t/2)$,
we have $h(s) \le h(t)$. If $h(t/1) < \varphi(h(t/2), h(t/1))$, then $h(t) = \varphi(h(t/2), h(t/1))$, and by
lemma 8.3, $h(s) = h(s/1) + h(s/2) \le h(t/1) + h(t/2) < \varphi(h(t/2), h(t/1)) = h(t)$. The
other cases are left to the reader. □

Theorem 9.2 implies that there exist total orderings of order type $\Gamma_0$ extending the
partial order $\preceq$ on (finite) trees. De Jongh and Parikh [6] proved that the maximum (sup)
of all the total extensions is attained, and they computed the maximum for certain of the
(Higman) orderings. The ordinals associated with various orderings on trees arising in the
theory of rewriting systems have been investigated by Dershowitz and Okada [9], Okada
and Takeuti [38], and Okada [37, 39, 40].

Theorem  9.2  also  has  the  following  important  corollary. 

Lemma 9.3 Kruskal’s theorem implies that $\mathcal{O}(\Gamma_0)$ is well-ordered by $\le$.

Proof. Assume that there is some infinite sequence $(\alpha_i)_{i \ge 1}$ of ordinals in $\mathcal{O}(\Gamma_0)$ such that
$\alpha_{i+1} < \alpha_i$ for all $i \ge 1$. By theorem 9.2, since h is surjective, there is an infinite sequence of
trees $(t_i)_{i \ge 1}$ such that $h(t_i) = \alpha_i$ for all $i \ge 1$. By Kruskal’s theorem (theorem 4.12), there
exist $i,j > 0$ such that $i < j$ and $t_i \preceq t_j$. By theorem 9.2, we have $\alpha_i = h(t_i) \le h(t_j) = \alpha_j$,
contradicting the fact that $\alpha_j < \alpha_i$. Hence, $\mathcal{O}(\Gamma_0)$ is well-ordered by $\le$. □

Let us denote by $WO(\Gamma_0)$ the property that $\mathcal{O}(\Gamma_0)$ is well-ordered by $\le$, and by
$WQO(T)$ the property that the embedding relation $\preceq$ is a wqo on the set T of finite trees.
$WQO(T)$ is a formal statement of Kruskal’s theorem.

For every formal system S, if the proof that $(WQO(T) \supset WO(\Gamma_0))$ (given in lemma
9.3) can be formalized in S and $WO(\Gamma_0)$ is not provable in S, then $WQO(T)$ is not provable
in S. In the next section, we briefly describe some subsystems of 2nd-order arithmetic in
which Kruskal’s theorem and its miniature versions are not provable.

10 The Subsystems $ACA_0$, $ATR_0$, $\Pi^1_1$-$CA_0$, of Second-Order Arithmetic

Harvey Friedman has shown that $WO(\Gamma_0)$ is not provable in some relatively strong
subsystems of 2nd-order arithmetic, and therefore, Kruskal’s theorem is not provable in such
systems. Friedman also proved similar results for some finite (first-order) miniaturizations
of Kruskal’s theorem. In particular, these first-order versions of Kruskal’s theorem are not
provable in Peano’s arithmetic, since transfinite induction up to $\epsilon_0$ is not provable in Peano’s
arithmetic, due to a result of Gentzen. We now provide some details on these subsystems
of 2nd-order arithmetic.

Second-order arithmetic can be formulated over a two-sorted language with number
variables $(m, n, \ldots)$ and set variables $(X, Y, \ldots)$. We define numerical terms as terms built
up from number variables, the constant symbols 0, 1, and the function symbols $+$ (addition)
and $\cdot$ (multiplication). An atomic formula is either of the form $t_1 = t_2$, or $t_1 < t_2$, or $t_1 \in X$,
where $t_1$ and $t_2$ are numerical terms. A formula is built up from atomic formulae using
$\land$, $\lor$, $\supset$, $\equiv$, $\lnot$, number quantifiers $\forall n$, $\exists n$, and set quantifiers $\forall X$, $\exists X$. We say that a formula
is arithmetical iff it does not contain set quantifiers.

All systems of second-order arithmetic under consideration include standard axioms
stating that $(\mathbf{N}, 0, 1, +, \cdot, <)$ is an ordered semi-ring. The real power of a system of
second-order arithmetic is given by the form of its induction axioms, and the form of its
comprehension axioms.

For  the  systems  under  consideration,  the  induction  axiom  is 

$[0 \in X \land \forall m(m \in X \supset m + 1 \in X)] \supset \forall n(n \in X)$,

where X is a set variable. This form of induction is often called restricted induction, in
contrast with the principle of full induction stated as

$[\varphi(0) \land \forall m(\varphi(m) \supset \varphi(m + 1))] \supset \forall n\,\varphi(n)$,

where $\varphi$ is an arbitrary 2nd-order formula. Apparently, Friedman initiated the study of
subsystems of 2nd-order arithmetic with restricted induction (this explains the subscript 0
after the name of the systems $ACA$, $ATR$, or $\Pi^1_1$-$CA$).

The system $\Pi^1_\infty$-$CA_0$, also known as $Z_2$, or second-order arithmetic, has
comprehension axioms of the form

$\exists X \forall n(n \in X \equiv \varphi(n))$,

where $\varphi$ is any 2nd-order formula in which X is not free. This is a very powerful form of
comprehension axioms. Subsystems of $Z_2$ are obtained by restricting the class of formulae
for which comprehension axioms hold.

The system $ACA_0$ is obtained by restricting the comprehension axioms to arithmetical
formulae in which X is not free ($ACA$ stands for Arithmetical Comprehension Axioms). It
turns out that $ACA_0$ is a conservative extension of (first-order) Peano Arithmetic (PA). A
weak form of König’s lemma is provable in $ACA_0$, and a fairly smooth theory of continuous
functions and of sequential convergence can be developed. For example, Friedman proved
that the Bolzano/Weierstrass theorem (every bounded sequence of real numbers contains
a convergent subsequence) is provable in $ACA_0$. In fact, Friedman proved the stronger
result that no set existence axioms weaker than those of $ACA_0$ are sufficient to establish
the Bolzano/Weierstrass theorem. For details, the reader is referred to Simpson [48].

The system $ATR_0$ contains axioms stating that arithmetical comprehension can be
iterated along any countable well ordering ($ATR$ stands for Arithmetical Transfinite
Recursion). A precise formulation of the axiom $ATR$ can be found in Friedman, McAloon, and
Simpson [16] (see also Feferman [14]), but it is not essential here. The system $ATR_0$ permits
a convenient development of a large part of ordinary mathematics, including the theory of
continuous functions, the Riemann integral, the theory of countable fields, the topology of
complete separable metric spaces, the structure theory of separable Banach spaces, a good
theory of countable well orderings, Borel sets, analytic sets, and more.

The system $\Pi^1_1$-$CA_0$ is obtained by allowing comprehension axioms in which $\varphi$ is any
$\Pi^1_1$-formula in which X is not free. This is a system even stronger than $ATR_0$, whose axioms
imply many mathematical results in the realm of algebra, analysis, classical descriptive set
theory, and countable combinatorics.

The systems $ACA$, $ATR$ and $\Pi^1_1$-$CA$ allow full induction rather than restricted
induction. It might be interesting to mention that the least ordinals for which transfinite
induction cannot be proved in $ACA_0$ and $ATR_0$ are respectively $\epsilon_0$ and $\Gamma_0$. Such an ordinal
has also been determined for $\Pi^1_1$-$CA_0$, but the notation system required to describe it is
beyond the scope of this paper. In contrast, the least ordinals for which transfinite induction
cannot be proved in $ACA$ and $ATR$ are respectively $\epsilon_{\epsilon_0}$ and $\Gamma_{\epsilon_0}$.

We now return to the connections with $\Gamma_0$ and Kruskal’s theorem. Friedman has
shown that $WO(\Gamma_0)$ is not provable in $ATR_0$ (Friedman, McAloon, and Simpson [16]). He
also showed that $(WQO(T) \supset WO(\Gamma_0))$ is provable in $ACA_0$. Since $ACA_0$ is a subsystem
of $ATR_0$, we conclude that $WQO(T)$ is not provable in $ATR_0$. This is already quite
remarkable, considering that a large part of ordinary mathematics can be done in $ATR_0$. But
Friedman also proved that the miniature version $LWQO(T)$ of Kruskal’s theorem given in
theorem 5.1 is not provable in $ATR_0$, an even more remarkable result. The proof of this
last result is given in Simpson [47].

There is one more “tour de force” of Friedman that we have not mentioned! Harvey
Friedman has formulated an extension of the miniature version of Kruskal’s theorem (using a
gap condition), and proved that this version of Kruskal’s theorem is not provable in $\Pi^1_1$-$CA_0$.
The proof can be found in Simpson [47]. There are also some connections between this last
version of Kruskal’s theorem and certain ordinal notations due to Takeuti known as ordinal
diagrams. These connections are investigated in Okada and Takeuti [38], and Okada [39,
40].

11  A  Brief  Introduction  to  Term  Orderings 

This  section  is  a  brief  introduction  to  term  orderings.  These  orderings  play  an  important 
role  in  computer  science,  because  they  are  the  main  tool  for  showing  that  sets  of  rewrite 
rules are finitely terminating (Noetherian). In turn, Noetherian sets of rewrite rules play a
fundamental role in automated deduction in equational logic. Indeed, one of the major
techniques  in  equational  logic  is  to  complete  a  given  set  of  equations  E  to  produce  an 
equivalent  set  R  of  rewrite  rules  which  has  some  “good”  properties,  namely  to  be  confluent 
and  Noetherian.  A  number  of  procedures  that  attempt  to  produce  such  a  set  R  of  rewrite 
rules  from  a  set  E  of  equations  have  been  designed.  The  first  such  procedure  is  due  to 
Knuth  and  Bendix  [27],  but  there  are  now  many  kinds  of  completion  procedures.  For  more 
details  on  completion  procedures,  we  refer  the  reader  to  Dershowitz  [11]  and  Bachmair  [2]. 

There are many classes of term orderings, but an important class relevant to our
considerations is the class of simplification orderings, because Kruskal’s theorem can be used to
prove the well-foundedness of these orderings. For a comprehensive study of term orderings,
the reader is referred to Dershowitz’s excellent survey [7] and to Dershowitz’s fundamental
paper [8].

Given a set of labels $\Sigma$, the notion of a tree was defined in definition 4.2. When
considering rewrite rules, we usually assume that $\Sigma$ is a ranked alphabet, that is, that there
is a ranking function $r : \Sigma \to \mathbf{N}$ assigning a natural number $r(f)$, the rank (or arity) of f,
to every $f \in \Sigma$. We also have a countably infinite set X of variables, with $r(x) = 0$ for every
$x \in X$, and we let $T_\Sigma(X)$ be the set of all trees (also called $\Sigma$-terms, or terms) $t \in T_{\Sigma \cup X}$
such that, for every tree address $u \in \mathit{dom}(t)$, $r(t(u)) = \mathit{rank}(t/u)$. In other words, the rank
of the label of u is equal to the rank of $t/u$ (see definition 4.3), the number of immediate
successors of u.

Given a tree t, we let $Var(t) = \{x \in X \mid \exists u \in \mathit{dom}(t),\ t(u) = x\}$ denote the set of
variables occurring in t. A ground term t is a term such that $Var(t) = \emptyset$.

Definition 11.1 A set of rewrite rules is a binary relation $R \subseteq T_\Sigma(X) \times T_\Sigma(X)$ such that
$Var(r) \subseteq Var(l)$ whenever $(l,r) \in R$.

A rewrite rule $(l,r) \in R$ is usually denoted as $l \to r$. The notions of tree replacement
and substitution are needed for the definition of the rewrite relation induced by a set of
rewrite rules.

Definition 11.2 Given two trees $t_1$ and $t_2$ and a tree address u in $t_1$, the result of replacing
$t_2$ at u in $t_1$, denoted by $t_1[u \leftarrow t_2]$, is the function whose graph is the set of pairs

$\{(v, t_1(v)) \mid v \in \mathit{dom}(t_1),\ u \text{ is not a prefix of } v\} \cup \{(uv, t_2(v)) \mid v \in \mathit{dom}(t_2)\}$.
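
The replacement operation of definition 11.2 is a simple computation on graphs of functions. As a minimal sketch, assume a tree is stored as a Python dictionary mapping tree addresses (tuples of positive integers, the root being the empty tuple) to labels; this encoding and the function name `replace` are assumptions made for illustration.

```python
def replace(t1, u, t2):
    """Return t1[u <- t2] for trees given as {address: label} dictionaries."""
    if u not in t1:
        raise ValueError("u must be a tree address of t1")
    # Keep the pairs (v, t1(v)) such that u is not a prefix of v ...
    kept = {v: label for v, label in t1.items() if v[:len(u)] != u}
    # ... and add the pairs (uv, t2(v)) for every address v of t2.
    grafted = {u + v: label for v, label in t2.items()}
    return {**kept, **grafted}


# f(a, g(b)) with the subtree at address 2 replaced by h(c) gives f(a, h(c)).
t1 = {(): "f", (1,): "a", (2,): "g", (2, 1): "b"}
t2 = {(): "h", (1,): "c"}
result = replace(t1, (2,), t2)
```

Replacing at the root, `replace(t1, (), t2)`, returns a copy of `t2`, since the empty address is a prefix of every address.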

Definition 11.3 A substitution is a function $\sigma : X \to T_\Sigma(X)$, such that $\sigma(x) \ne x$ for only
finitely many $x \in X$. Since $T_\Sigma(X)$ is the free $\Sigma$-algebra generated by X, every substitution
$\sigma : X \to T_\Sigma(X)$ has a unique homomorphic extension $\hat{\sigma} : T_\Sigma(X) \to T_\Sigma(X)$. In the sequel,
we will identify $\sigma$ and its homomorphic extension $\hat{\sigma}$, and denote $\hat{\sigma}(t)$ as $t[\sigma]$.

Definition 11.4 Given a substitution $\sigma$, the domain of $\sigma$ is the set of variables $D(\sigma) =
\{x \mid \sigma(x) \ne x\}$. Given a substitution $\sigma$, if its domain is the set $\{x_1,\ldots,x_n\}$, and if $t_i = \sigma(x_i)$,
$1 \le i \le n$, then $\sigma$ is also denoted by $[t_1/x_1,\ldots,t_n/x_n]$.
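
The homomorphic extension of a substitution can be written as a short recursive function. In the following sketch, the term encoding is an assumption: a variable is a Python string, and a term $f(t_1,\ldots,t_n)$ is the tuple `(f, t1, ..., tn)`, so that a constant `a` is `("a",)`. A substitution is a dictionary from variable names to terms, and variables absent from the dictionary are left unchanged. The names `apply_subst` and `domain` are hypothetical.

```python
def apply_subst(sigma, t):
    """Compute t[sigma], the homomorphic extension of sigma applied to t."""
    if isinstance(t, str):            # t is a variable
        return sigma.get(t, t)
    f, *args = t                      # t = f(t1, ..., tn)
    return (f, *(apply_subst(sigma, s) for s in args))


def domain(sigma):
    """The set D(sigma) of variables x such that sigma(x) differs from x."""
    return {x for x, t in sigma.items() if t != x}


# sigma = [a/x]; the binding of y to itself does not belong to the domain.
sigma = {"x": ("a",), "y": "y"}
term = ("f", "x", ("g", "y", "z"))
image = apply_subst(sigma, term)
```

Here `image` is the encoding of $f(a, g(y, z))$ and `domain(sigma)` is `{"x"}`.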

Definition 11.5 A substitution $\sigma$ is a renaming iff $\sigma(x)$ is a variable for every $x \in D(\sigma)$,
and $\sigma$ is injective. Let $R \subseteq T_\Sigma(X) \times T_\Sigma(X)$ be a set of rewrite rules. A rewrite rule
$s \to t$ is a variant of a rewrite rule $u \to v \in R$ iff there is some renaming $\rho$ with domain
$Var(u) \cup Var(v)$ such that $s = u[\rho]$ and $t = v[\rho]$.

Definition 11.6 Let $\to$ be a binary relation $\to\ \subseteq T_\Sigma(X) \times T_\Sigma(X)$. (i) The relation $\to$
is monotonic (or stable under the algebra structure) iff for every two terms s, t and every
function symbol $f \in \Sigma$, if $s \to t$ then $f(\ldots,s,\ldots) \to f(\ldots,t,\ldots)$.

(ii) The relation $\to$ is stable (under substitution) if $s \to t$ implies $s[\sigma] \to t[\sigma]$ for
every substitution $\sigma$.

Definition 11.7 Let $R \subseteq T_\Sigma(X) \times T_\Sigma(X)$ be a set of rewrite rules. The relation $\to_R$
over $T_\Sigma(X)$ is defined as the smallest stable and monotonic relation that contains R. This
is the rewrite relation associated with R.

This relation is defined explicitly as follows: Given any two terms $t_1,t_2 \in T_\Sigma(X)$, then

$t_1 \to_R t_2$

iff there is some variant $l \to r$ of some rule in R, some tree address $\alpha$ in $t_1$, and some
substitution $\sigma$, such that

$t_1/\alpha = l[\sigma]$, and $t_2 = t_1[\alpha \leftarrow r[\sigma]]$.

We say that a rewrite system R is Noetherian iff the relation $\to_R$ associated with R
is Noetherian.
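
The explicit description of $\to_R$ suggests a direct way to enumerate the one-step rewrites of a term: try each rule at each tree address, find $\sigma$ by matching, and replace. The Python sketch below does this under assumptions of our own: variables are strings, a term $f(t_1,\ldots,t_n)$ is the tuple `(f, t1, ..., tn)`, and rules are pairs `(l, r)`. Because matching binds the rule's variables independently of any variables of the rewritten term, the sketch does not rename rules into variants. The names `match`, `substitute` and `rewrite_once` are hypothetical.

```python
def match(pattern, term, sigma=None):
    """Return a substitution sigma with pattern[sigma] == term, or None."""
    sigma = dict(sigma or {})
    if isinstance(pattern, str):                  # a rule variable
        if pattern in sigma and sigma[pattern] != term:
            return None                           # conflicting binding
        sigma[pattern] = term
        return sigma
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, s in zip(pattern[1:], term[1:]):
        sigma = match(p, s, sigma)
        if sigma is None:
            return None
    return sigma


def substitute(sigma, t):
    """Apply sigma to t (variables outside sigma are left unchanged)."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0], *(substitute(sigma, s) for s in t[1:]))


def rewrite_once(rules, t):
    """Yield every t2 obtained from t by one rewrite step."""
    for l, r in rules:                            # rewrite at the root
        sigma = match(l, t)
        if sigma is not None:
            yield substitute(sigma, r)
    if not isinstance(t, str):                    # rewrite inside a subterm
        for i in range(1, len(t)):
            for s in rewrite_once(rules, t[i]):
                yield t[:i] + (s,) + t[i + 1:]


# Addition on unary numerals: plus(0, y) -> y, plus(s(x), y) -> s(plus(x, y)).
Z = ("0",)
RULES = [
    (("plus", Z, "y"), "y"),
    (("plus", ("s", "x"), "y"), ("s", ("plus", "x", "y"))),
]
```

Starting from `("plus", ("s", Z), Z)`, two successive steps lead to `("s", Z)`, to which no rule applies.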

Now,  our  goal  is  to  describe  some  orderings  that  will  allow  us  to  prove  that  sets  of 
rewrite  rules  are  Noetherian.  First,  it  is  convenient  to  introduce  the  concept  of  a  strict 
ordering. 

Definition 11.8 A strict ordering (or strict order) $\prec$ on a set A is a transitive and
irreflexive relation (for all a, $a \not\prec a$).

Given a preorder (or partial order) $\preceq$ on a set A, the strict ordering $\prec$ associated with
$\preceq$ is defined such that $s \prec t$ iff $s \preceq t$ and $t \not\preceq s$. Conversely, given a strict ordering $\prec$,
the partial ordering $\preceq$ associated with $\prec$ is defined such that $s \preceq t$ iff $s \prec t$ or $s = t$. The
converse of a strict ordering $\prec$ is denoted as $\succ$.

We  now  introduce  the  important  concepts  of  simplification  ordering,  and  reduction 
ordering. Let $\Sigma$ be a set of labels (in most cases, a ranked alphabet).

Definition 11.9 A strict order $\prec$ on $T_\Sigma$ satisfying conditions

(1) $s \prec f(\ldots,s,\ldots)$, and

(2) $f(\ldots) \prec f(\ldots,s,\ldots)$,

is said to have the subterm property and the deletion property.

A simplification ordering is a strict ordering that is monotonic and has the subterm
and deletion property.^1

A reduction ordering $\succ$ is a strict ordering that is monotonic, stable under substitution,
and such that $\succ$ is well-founded.

With a slight abuse of language, we will also say that the converse $\succ$ of a strict ordering
$\prec$ is a simplification ordering (or a reduction ordering). The importance of term orderings
is shown by the next fundamental result.

Lemma 11.10 A set of rules $R$ is Noetherian if and only if there exists a reduction
ordering $\succ$ on $T_\Sigma(X)$ such that $l \succ r$ for every $l \longrightarrow r \in R$.

Unfortunately, it is undecidable in general whether an arbitrary system $R$ is Noetherian:
it is possible to encode Turing machines using a system of two rewrite rules, so a decision
procedure for termination would imply the decidability of the halting problem (see
Dershowitz [7]). The importance of simplification orderings is shown by the next theorem.

Theorem 11.11 (Dershowitz) If $\Sigma$ is finite, then every simplification ordering on $T_\Sigma$ is
well-founded.

Proof .  This  is  a  consequence  of  proposition  4.8,  which  uses  Kruskal’s  tree  theorem.  □ 

In practice, we want theorem 11.11 to apply to simplification orderings on $T_\Sigma(X)$, but
since $X$ is infinite, there is a problem. However, we are saved because we usually only care
about terms arising in derivations.

^1 When $\Sigma$ is a ranked alphabet, the deletion property is superfluous.

Definition 11.12 An ordering $\succ$ is well-founded for derivations iff $\succ \cap \stackrel{*}{\longrightarrow}_R$ is
well-founded for every finite rewrite system $R$.

Since $Var(r) \subseteq Var(l)$ for every $l \longrightarrow r \in R$, every derivation of a finite rewrite system
involves only finitely many symbols. Thus, as a corollary of the above theorem we have:

Corollary 11.13 (Dershowitz) Every simplification ordering is well-founded for derivations.

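Warning: There exist rewrite systems whose termination cannot be shown by any
total simplification ordering, as shown by the following example.

Example 11.14

$$f(a) \longrightarrow f(b)$$

$$g(b) \longrightarrow g(a)$$

Indeed, a total ordering must have either $a \succ b$ or $b \succ a$, and by monotonicity the first
choice fails to orient the second rule, while the second choice fails to orient the first.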

Next,  we  are  going  to  describe  two  important  classes  of  simplification  orderings,  the 
recursive  path  ordering,  and  the  lexicographic  path  ordering.  But  first,  we  need  to  review 
the  definitions  of  the  lexicographic  ordering  and  the  multiset  ordering. 

Definition 11.15 Given $n$ partially ordered sets $(S_i, \prec_i)$ (where each $\prec_i$ is a strict order,
$n > 1$), the lexicographic order $\prec_{lex}$ on the set $S_1 \times \cdots \times S_n$ is defined as follows. Let
$(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ be members of $S_1 \times \cdots \times S_n$. Then

$$(a_1, \ldots, a_n) \prec_{lex} (b_1, \ldots, b_n)$$

if and only if there exists some $i$, $1 \le i \le n$, such that $a_i \prec_i b_i$, and $a_j = b_j$ for all $j$,
$1 \le j < i$.
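As an illustration only, Definition 11.15 can be sketched in Python. The names `lex_less` and `component_orders` are hypothetical, and each component order is assumed to be supplied as a Boolean function implementing a strict order on its own set.

```python
def lex_less(a, b, component_orders):
    """Return True iff tuple a is strictly below tuple b lexicographically.

    component_orders[i](x, y) is assumed to decide x <_i y (a strict order).
    """
    assert len(a) == len(b) == len(component_orders)
    for x, y, less in zip(a, b, component_orders):
        if x == y:
            continue            # a_j = b_j: keep scanning to the right
        return less(x, y)       # the first differing position i decides
    return False                # identical tuples are not strictly ordered


# Hypothetical usage with the usual order on the natural numbers.
lt = lambda x, y: x < y
```

With partial component orders, two tuples whose first differing components are incomparable are themselves incomparable, which is what the early `return less(x, y)` expresses.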

We  now  turn  to  multiset  orderings.  Multiset  orderings  have  been  investigated  by 
Dershowitz  and  Manna  [10],  and  Jouannaud  and  Lescanne  [24]. 

Definition 11.16 Given a set $A$, a multiset over $A$ is an unordered collection of elements
of $A$ which may have multiple occurrences of identical elements. More formally, a multiset
over $A$ is a function $M : A \to \mathbb{N}$ (where $\mathbb{N}$ is the set of natural numbers) such that an
element $a \in A$ has exactly $n$ occurrences in $M$ iff $M(a) = n$. In particular, $a$ does not
belong to $M$ when $M(a) = 0$, and we say that $a \in M$ iff $M(a) > 0$.

The union of two multisets $M_1$ and $M_2$, denoted by $M_1 \cup M_2$, is defined as the multiset
$M$ such that for all $a \in A$, $M(a) = M_1(a) + M_2(a)$.

Let $(S, \prec)$ be a partially ordered set (where $\prec$ is a strict order), let $M$ be some finite
multiset of objects from $S$, and finally let $n, n'_1, \ldots, n'_k \in S$. Define the relation $\Leftarrow_m$ on
finite multisets as

$$M \cup \{n'_1, \ldots, n'_k\} \Leftarrow_m M \cup \{n\},$$

where $k \ge 0$ and $n'_i \prec n$ for all $i$, $1 \le i \le k$.

The multiset ordering $\prec_{M(S)}$ is simply the transitive closure $\stackrel{+}{\Leftarrow}_m$ of $\Leftarrow_m$.

In other words, $N' \prec_{M(S)} N$ iff $N'$ is produced from a finite multiset $N$ by removing
one or more elements and replacing them with any finite number of elements, each of which is
strictly smaller than at least one element removed. For example, $\{4, 4, 3, 3, 1\} \prec_{M(\mathbb{N})} \{5, 3, 1, 1\}$,
where $\prec_{M(\mathbb{N})}$ is the multiset ordering induced by the ordering $<$ of the natural numbers.
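The multiset ordering can be tested directly on small examples. The sketch below does not compute a transitive closure; it assumes the standard Dershowitz-Manna characterization (the multisets differ, and every element left over on the smaller side is dominated by some element left over on the larger side), which is known to agree with the definition for strict partial orders. The function name `multiset_less` is hypothetical.

```python
from collections import Counter


def multiset_less(m, n, less):
    """Return True iff multiset m is strictly below multiset n.

    m and n are iterables of hashable elements; less(x, y) is assumed
    to decide x < y for a strict order on the elements.
    """
    m, n = Counter(m), Counter(n)
    only_m = m - n          # elements of m not cancelled by n
    only_n = n - m          # elements of n not cancelled by m
    if not only_m and not only_n:
        return False        # equal multisets
    # Each leftover element of m must lie below some leftover element of n.
    return all(any(less(x, y) for y in only_n) for x in only_m)


lt = lambda x, y: x < y
```

For the example in the text, the elements 5 and 1 are removed from {5, 3, 1, 1} and replaced by 4, 4, 3, each of which is smaller than 5.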

It is easy to show that for any partially ordered set $(S, \preceq)$, we have associated partially
ordered sets $(M(S), \preceq_{M(S)})$ (where $M(S)$ is the set of all finite multisets of members of $S$),
and $(S^n, \preceq_{lex})$ for $n > 0$. Furthermore $\preceq$ is total (respectively, well-founded) iff $\preceq_{lex}$ (for
any $n$) is total (respectively, well-founded).

Using König’s lemma, we can also show the following useful result.

Lemma 11.17 If $\prec$ is well-founded (respectively, total) on $S$, then $\prec_{M(S)}$ is well-founded
(respectively, total) on $M(S)$.

There is an interesting connection between the multiset ordering and ordinal exponentiation.
Given a well ordering $\preceq$ on a set $S$, it is well known that there is a unique ordinal
$\alpha$ and a unique order-preserving bijection $\varphi : S \to \alpha$.

The connection is that $(M(S), \preceq_{M(S)})$ is order-isomorphic to $\omega^\alpha$. Indeed, the function
$\psi : M(S) \to \omega^\alpha$ defined such that $\psi(\emptyset) = 0$, and

$$\psi(\{m_1, \ldots, m_k\}) = \omega^{\varphi(m_1)} + \cdots + \omega^{\varphi(m_k)},$$

where $\varphi(m_1) \ge \ldots \ge \varphi(m_k)$ is the nonincreasing sequence enumerating $\varphi(\{m_1, \ldots, m_k\})$,^2
is easily shown to be an order-isomorphism.

The lexicographic ordering and the multiset ordering can also be defined for preorders.
This generalization will be needed for defining rpo and lpo orderings based on preorders.

Definition 11.18 Given $n$ preordered sets $(S_i, \preceq_i)$ ($n > 1$), the lexicographic preorder
$\preceq_{lex}$ on the set $S_1 \times \cdots \times S_n$ is defined as follows:

$$(a_1, \ldots, a_n) \preceq_{lex} (b_1, \ldots, b_n)$$

^2 In the theory of ordinals, the sum $\omega^{\varphi(m_1)} + \cdots + \omega^{\varphi(m_k)}$ is a natural sum.


Draft/ September  30,  1993 



if and only if there exists some $i$, $1 \le i \le n$, such that $a_i \prec_i b_i$, and $a_j \approx_j b_j$ for all $j$,
$1 \le j < i$.

Definition 11.19 Let $(S, \preceq)$ be a preordered set, let $M$ be some finite multiset of objects
from $S$, and finally let $n, n'_1, \ldots, n'_k \in S$. Define the relation $\Leftarrow_m$ on finite multisets as

$$M \cup \{n'_1, \ldots, n'_k\} \Leftarrow_m M \cup \{n\},$$

where either $k = 1$ and $n \approx n'_1$, or $k \ge 0$ and $n'_i \prec n$ for all $i$, $1 \le i \le k$.^3,4

The multiset preorder $\preceq_{M(S)}$ is the transitive closure $\stackrel{+}{\Leftarrow}_m$ of $\Leftarrow_m$.

Two finite multisets $M_1$ and $M_2$ are equivalent ($M_1 \approx_{M(S)} M_2$) iff they have the same
number of elements, and every element of $M_1$ is equivalent to some element of $M_2$ and vice
versa. It is easy to show that for any preordered set $(S, \preceq)$ we have associated preordered
sets $(M(S), \preceq_{M(S)})$ (where $M(S)$ is the set of all finite multisets of members of $S$), and
$(S^n, \preceq_{lex})$ for $n > 0$. Furthermore $\preceq$ is total (respectively, well-founded) iff $\preceq_{lex}$ (for any
$n$) is total (respectively, well-founded).

Using König’s lemma, we can also show that lemma 11.17 holds for preorders.

Lemma 11.20 If $\preceq$ is a well-founded (respectively, total) preorder on $S$, then $\preceq_{M(S)}$ is
well-founded (respectively, total) on $M(S)$.

A naive ordering on terms based on the notion of lexicographic order is as follows.

For any given ordering $\succ$ on $\Sigma$ we say that

$$s = f(s_1, \ldots, s_n) \succ_{lex} g(t_1, \ldots, t_m) = t$$

iff either

(i) $f \succ g$; or

(ii) $f = g$ and $(s_1, \ldots, s_n) \succ_{lex} (t_1, \ldots, t_n)$,

where $\succ_{lex}$ on tuples is the lexicographic extension of $\succ_{lex}$ to $n$-tuples of terms (the success of this
recursive definition depends on the fact that we use the lexicographic extension over terms
smaller than $s$ and $t$).

It is easy to show by structural induction on terms that $\succ_{lex}$ is total on ground terms
whenever $\succ$ is total on $\Sigma$, but it has a severe defect: it is not well-founded. For example,
if $a \succ f$ then we have $a \succ_{lex} fa \succ_{lex} ffa \succ_{lex} \cdots$. The problem arises since it is possible
for a term to be strictly smaller than one of its subterms.

^3 As usual, the equivalence $\approx$ associated with a preorder $\preceq$ is defined such that $a \approx b$ iff $a \preceq b$ and
$b \preceq a$.

^4 As usual, given a preorder $\preceq$, the strict order $\prec$ is defined such that $a \prec b$ iff $a \preceq b$ and $b \not\preceq a$.

The  most  powerful  forms  of  reduction  orderings  are  based  on  the  relative  syntactic 
simplicity  of  two  terms,  i.e.,  on  the  notion  of  a  simplification  ordering.  Although  there  are 
many  types  of  simplification  orderings,  one  of  the  most  elegant  and  useful  is  the  recursive 
path  ordering,  for  short,  rpo. 

Definition 11.21 Let $\preceq$ be a preorder on $\Sigma$. The recursive path ordering $\succeq_{rpo}$ on $T_\Sigma(X)$,
for short, rpo, is defined below. Actually, we give a simultaneous recursive definition of
$\succeq_{rpo}$, $\succ_{rpo}$, and $\approx_{rpo}$, where $s \succ_{rpo} t$ iff $s \succeq_{rpo} t$ and $s \not\preceq_{rpo} t$, and $s \approx_{rpo} t$ iff $s \succeq_{rpo} t$ and
$s \preceq_{rpo} t$.

Then, $f(s_1, \ldots, s_n) \succeq_{rpo} g(t_1, \ldots, t_m)$ holds iff one of the conditions below holds:

(i) $f \approx g$ and $\{s_1, \ldots, s_n\} \succeq^{mult}_{rpo} \{t_1, \ldots, t_m\}$; or

(ii) $f \succ g$ and $f(s_1, \ldots, s_n) \succ_{rpo} t_i$ for all $i$, $1 \le i \le m$; or

(iii) $s_i \succeq_{rpo} g(t_1, \ldots, t_m)$ for some $i$, $1 \le i \le n$,

where $\succeq^{mult}_{rpo}$ is the extension of $\succeq_{rpo}$ to multisets.^5

Note that since the preorder $\preceq$ is only defined on $\Sigma$, variables are regarded as incomparable
symbols. In (ii), the purpose of the condition $f(s_1, \ldots, s_n) \succ_{rpo} t_i$ for all $i$, is to
insure that $f(s_1, \ldots, s_n) \succ_{rpo} g(t_1, \ldots, t_m)$.

Theorem 11.22 (Dershowitz, Lescanne) The relation $\succ_{rpo}$ is a simplification ordering
stable under substitution. Furthermore, if the strict order $\succ$ is well-founded on $\Sigma$, then
$\succ_{rpo}$ is well-founded, even when $\Sigma$ is infinite.

Proof sketch. Proving that rpo is a simplification ordering is laborious, especially transitivity.
The complete proof can be found in Dershowitz [8]. In order to prove that $\succ_{rpo}$ is
well-founded when $\succ$ is well-founded on $\Sigma$, it is tempting to apply proposition 4.8 to the
preorders $\ll$ and $\preceq_{rpo}$, where $\ll$ is defined such that $s \ll t$ iff $root(s) \preceq root(t)$, since the
conditions of this lemma hold. Unfortunately, $\preceq$ is not a wqo. However, we can use the
idea from theorem 4.10 to extend $\preceq$ to a total well-founded ordering $\le$. Then, by theorem
4.7, the embedding preorder $\preceq_{\le}$ induced by $\le$ (see definition 4.6) is a wqo, and thus, it is
well-founded. We can now apply proposition 4.8, which shows that $\le_{rpo}$ (the rpo induced
by $\le$) is well-founded. Finally, we prove by induction on terms that $\le_{rpo}$ contains $\preceq_{rpo}$,
which proves that $\succ_{rpo}$ itself is well-founded. □

^5 Other authors define $\succ^{mult}_{rpo}$ as the multiset extension of the strict order $\succ_{rpo}$, and $s \succeq_{rpo} t$ iff
$s \succ_{rpo} t$ or $s = t$. Our definition is more general.

A proof not involving Kruskal’s theorem, but using Zorn’s lemma, is given in Lescanne
[29]. Of course, a strict order on a finite set is always a wqo, and the significance of the
second part of the theorem is that it holds even when $\Sigma$ is infinite.

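Example 11.23 Consider the following set of rewrite rules to convert a proposition to
disjunctive normal form:

$$\neg(P \lor Q) \longrightarrow \neg P \land \neg Q,$$

$$\neg(P \land Q) \longrightarrow \neg P \lor \neg Q,$$

$$P \land (Q \lor R) \longrightarrow (P \land Q) \lor (P \land R),$$

$$(P \lor Q) \land R \longrightarrow (P \land R) \lor (Q \land R),$$

$$\neg\neg P \longrightarrow P,$$

$$P \lor P \longrightarrow P,$$

$$P \land P \longrightarrow P.$$

This system can be easily shown to be Noetherian using the rpo induced by the following
ordering on the set of operators: $\neg \succ \land \succ \lor$.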
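The orientation of each rule can be checked mechanically. The following Python sketch is not the preorder-based formulation of definition 11.21: it assumes a strict total precedence (so equivalent symbols never arise), it assumes the precedence ¬ ≻ ∧ ≻ ∨, and it represents variables as strings and applications as `(symbol, args)` tuples. All names (`rpo_gt`, `equiv`, `multiset_gt`, `PREC`) are hypothetical.

```python
def is_var(t):
    return isinstance(t, str)


def equiv(s, t):
    """Equality of terms up to permutation of arguments."""
    if is_var(s) or is_var(t):
        return s == t
    (f, ss), (g, ts) = s, t
    if f != g or len(ss) != len(ts):
        return False
    rest = list(ts)
    for a in ss:
        for i, b in enumerate(rest):
            if equiv(a, b):
                del rest[i]
                break
        else:
            return False
    return True


def multiset_gt(ss, ts, gt):
    """Strict multiset extension of gt, after cancelling equivalent pairs."""
    rest_s, rest_t = [], list(ts)
    for a in ss:
        for i, b in enumerate(rest_t):
            if equiv(a, b):
                del rest_t[i]
                break
        else:
            rest_s.append(a)
    return bool(rest_s) and all(any(gt(a, b) for a in rest_s) for b in rest_t)


def rpo_gt(s, t, prec):
    """Strict rpo for a strict total precedence prec (symbol -> int)."""
    if is_var(s):
        return False
    f, ss = s
    if any(equiv(si, t) or rpo_gt(si, t, prec) for si in ss):   # clause (iii)
        return True
    if is_var(t):
        return False
    g, ts = t
    if prec[f] > prec[g]:                                       # clause (ii)
        return all(rpo_gt(s, tj, prec) for tj in ts)
    if f == g:                                                  # clause (i)
        return multiset_gt(ss, ts, lambda a, b: rpo_gt(a, b, prec))
    return False


Not = lambda a: ("not", (a,))
And = lambda a, b: ("and", (a, b))
Or = lambda a, b: ("or", (a, b))
PREC = {"not": 3, "and": 2, "or": 1}      # assumed precedence: not > and > or
RULES = [
    (Not(Or("P", "Q")), And(Not("P"), Not("Q"))),
    (Not(And("P", "Q")), Or(Not("P"), Not("Q"))),
    (And("P", Or("Q", "R")), Or(And("P", "Q"), And("P", "R"))),
    (And(Or("P", "Q"), "R"), Or(And("P", "R"), And("Q", "R"))),
    (Not(Not("P")), "P"),
    (Or("P", "P"), "P"),
    (And("P", "P"), "P"),
]
ALL_ORIENTED = all(rpo_gt(lhs, rhs, PREC) for lhs, rhs in RULES)
```

Under these assumptions every left-hand side is greater than its right-hand side, whereas a commutativity pair such as `Or("P", "Q")` and `Or("Q", "P")` is not ordered in either direction.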

It is possible to show that $\succeq_{rpo}$ is total on ground terms whenever $\succeq$ is total on
$\Sigma$. It is also possible to define reduction orderings which are total on ground terms; the
problem with $\succeq_{rpo}$ is that it is not a partial order in general, but only a preorder, i.e., the
equivalence relation $\approx_{rpo}$ is not necessarily the identity. For example, for any $\succ$ we have
$f(a, b) \approx_{rpo} f(b, a)$ but clearly $f(a, b) \ne f(b, a)$. It is easy to show by structural induction
on terms, and using only clause (i) of the definition of rpo that for any two ground terms
$s = f(s_1, \ldots, s_n)$ and $t = g(t_1, \ldots, t_n)$, we have $s \approx_{rpo} t$ iff $f \approx g$ and $s_i \approx_{rpo} t_{\pi(i)}$, for
$1 \le i \le n$, where $\pi$ is some permutation of the set $\{1, \ldots, n\}$. (In other words, $s \approx_{rpo} t$
iff $s$ and $t$ are equal up to equivalence of symbols, and up to the permutation of the order
of the terms under each function symbol, where the permutation of subterms arises by the
comparison of multisets of subterms in clause (i) of the definition.)

This  motivates  the  following  definition. 

Definition 11.24 For any ordering $\succeq$ on $\Sigma$, let the term ordering $\succ_{rpol}$ be defined such
that $s \succ_{rpol} t$ iff either $s \succ_{rpo} t$ or $s$ and $t$ are ground, $s \approx_{rpo} t$, and $s \succ_{lex} t$.

Clearly for any total $\succeq$ on $\Sigma$ this is a reduction ordering total on ground terms, since
$\succeq_{rpo}$ is total on ground terms and if $s \not\succ_{rpo} t$ and $s \not\prec_{rpo} t$ then, since $\succ_{lex}$ is total on
ground terms, we must have either $s \succ_{lex} t$ or $s \prec_{lex} t$.

Thus, any time the underlying ordering on $\Sigma$ is total we have a total ordering on
$T_\Sigma$, even though the ordering may not be total on $T_\Sigma(X)$. This is a major problem with
term orderings: in order to preserve stability under substitution, they must treat variables as
incomparable symbols. Thus equations such as commutative axioms (e.g. $f(x, y) = f(y, x)$)
can never be oriented.

Warning: It is possible that for $R$ and $S$ rewrite systems on disjoint sets of function
(and constant) symbols, both $R$ and $S$ are Noetherian, but $R \cup S$ is not, as shown by the
following example due to Toyama.

Example 11.25

$$R = \{f(0, 1, z) \longrightarrow f(z, z, z)\}$$

$$S = \{g(x, y) \longrightarrow x, \quad g(x, y) \longrightarrow y\}$$

Observe that the term $f(g(0,1), g(0,1), g(0,1))$ rewrites to itself:

$$f(g(0,1), g(0,1), g(0,1)) \longrightarrow f(0, g(0,1), g(0,1))$$

$$\longrightarrow f(0, 1, g(0,1))$$

$$\longrightarrow f(g(0,1), g(0,1), g(0,1)).$$
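The cycle can be replayed with a small one-step rewriting function. This is only a sketch: variables are Python strings, applications are `(symbol, args)` tuples, the rule in $R$ is assumed to be $f(0,1,z) \longrightarrow f(z,z,z)$, and the helper names (`match`, `substitute`, `successors`) are hypothetical.

```python
def is_var(t):
    return isinstance(t, str)


def match(pattern, term, subst):
    """Extend subst so that pattern[subst] == term, or return None."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if is_var(term) or pattern[0] != term[0] or len(pattern[1]) != len(term[1]):
        return None
    for p, t in zip(pattern[1], term[1]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    if is_var(term):
        return subst.get(term, term)
    return (term[0], tuple(substitute(a, subst) for a in term[1]))


def successors(term, rules):
    """All terms reachable from term in exactly one rewrite step."""
    out = []
    for lhs, rhs in rules:                      # rewrite at the root
        s = match(lhs, term, {})
        if s is not None:
            out.append(substitute(rhs, s))
    if not is_var(term):                        # rewrite inside an argument
        f, args = term
        for i, a in enumerate(args):
            for b in successors(a, rules):
                out.append((f, args[:i] + (b,) + args[i + 1:]))
    return out


ZERO, ONE = ("0", ()), ("1", ())
f = lambda a, b, c: ("f", (a, b, c))
g = lambda a, b: ("g", (a, b))
R = [(f(ZERO, ONE, "z"), f("z", "z", "z"))]
S = [(g("x", "y"), "x"), (g("x", "y"), "y")]

G01 = g(ZERO, ONE)
T0 = f(G01, G01, G01)
T1 = f(ZERO, G01, G01)
T2 = f(ZERO, ONE, G01)
```

Each of the three steps uses a rule of $S$, $S$, and $R$ respectively, so the loop needs both systems; with `R` alone the term `T0` has no successor at all.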

Another  interesting  kind  of  term  ordering  is  the  lexicographic  path  ordering  due  to 
Kamin  and  Levy. 

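Definition 11.26 Let $\preceq$ be a preorder on $\Sigma$. The lexicographic path ordering $\succeq_{lpo}$ on $T_\Sigma(X)$,
for short, lpo, is defined below. Actually, we give a simultaneous recursive definition
of $\succeq_{lpo}$, $\succ_{lpo}$, and $\approx_{lpo}$, where $s \succ_{lpo} t$ iff $s \succeq_{lpo} t$ and $s \not\preceq_{lpo} t$, and $s \approx_{lpo} t$ iff $s \succeq_{lpo} t$ and
$s \preceq_{lpo} t$.

Then, $f(s_1, \ldots, s_n) \succeq_{lpo} g(t_1, \ldots, t_m)$ holds iff one of the conditions below holds:

(i) $f \approx g$, $s_1 \approx_{lpo} t_1$, $\ldots$, $s_{i-1} \approx_{lpo} t_{i-1}$, $s_i \succ_{lpo} t_i$, and $s \succ_{lpo} t_{i+1}, \ldots, s \succ_{lpo} t_n$, for
some $i$, $1 \le i \le n$, with $s = f(s_1, \ldots, s_n)$ and $m = n$; or
(ii) $f \succ g$ and $f(s_1, \ldots, s_n) \succ_{lpo} t_i$ for all $i$, $1 \le i \le m$; or
(iii) $s_i \succeq_{lpo} g(t_1, \ldots, t_m)$ for some $i$, $1 \le i \le n$.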

Note that since the preorder $\preceq$ is only defined on $\Sigma$, variables are regarded as incomparable
symbols. Also, condition (i) is sometimes stated as:

(i’) $f \approx g$, $(s_1, \ldots, s_n) \succeq^{lex}_{lpo} (t_1, \ldots, t_n)$, $m = n$, and $f(s_1, \ldots, s_n) \succ_{lpo} t_i$ for all $i$,
$1 \le i \le n$, where $\succeq^{lex}_{lpo}$ is the lexicographic extension of $\succeq_{lpo}$ on $n$-tuples.^6

It is easily seen that (i) and (i’) are equivalent. In (i), the purpose of the conditions
$s \succ_{lpo} t_{i+1}, \ldots, s \succ_{lpo} t_n$ is to insure that $f(s_1, \ldots, s_n) \succ_{lpo} g(t_1, \ldots, t_n)$ iff $s_i \succ_{lpo} t_i$.
Similarly, in (ii), the purpose of the condition $f(s_1, \ldots, s_n) \succ_{lpo} t_i$ for all $i$, is to insure
that $f(s_1, \ldots, s_n) \succ_{lpo} g(t_1, \ldots, t_m)$.

^6 Other authors define $\succ^{lex}_{lpo}$ as the lexicographic extension of the strict order $\succ_{lpo}$, and $s \succeq_{lpo} t$ iff
$s \succ_{lpo} t$ or $s = t$. Our definition is more general.

Theorem 11.27 (Kamin, Levy) The relation $\succ_{lpo}$ is a simplification ordering stable under
substitution. Furthermore, if the strict order $\succ$ is well-founded on $\Sigma$, and equivalent symbols
have the same rank, then $\succ_{lpo}$ is well-founded, even when $\Sigma$ is infinite.

Proof. The proof uses the techniques used in theorem 11.22 (Kruskal’s theorem). □

As in the previous theorem on rpo, the significance of the second part of the theorem
is that it holds even when $\Sigma$ is infinite.

Example 11.28 Consider the following set of rewrite rules for free groups (Knuth and
Bendix [27]).

$$(x * y) * z \longrightarrow x * (y * z),$$

$$1 * x \longrightarrow x,$$

$$I(x) * x \longrightarrow 1,$$

$$I(x) * (x * y) \longrightarrow y,$$

$$x * 1 \longrightarrow x,$$

$$I(I(x)) \longrightarrow x,$$

$$x * I(x) \longrightarrow 1,$$

$$x * (I(x) * y) \longrightarrow y,$$

$$I(x * y) \longrightarrow I(y) * I(x).$$

This system can be easily shown to be Noetherian using the lpo induced by the following
ordering on the set of operators: $I \succ * \succ 1$.
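The orientation of these rules can be checked with a short program. The Python sketch below is the usual strict lpo for a strict total precedence, not the preorder-based formulation of definition 11.26; it assumes the precedence $I \succ * \succ 1$, represents variables as strings and applications as `(symbol, args)` tuples, and uses hypothetical names (`lpo_gt`, `PREC`, `RULES`).

```python
def is_var(t):
    return isinstance(t, str)


def lpo_gt(s, t, prec):
    """Strict lpo for a strict total precedence prec (symbol -> int)."""
    if is_var(s):
        return False
    f, ss = s
    if any(si == t or lpo_gt(si, t, prec) for si in ss):        # clause (iii)
        return True
    if is_var(t):
        return False
    g, ts = t
    if prec[f] > prec[g]:                                       # clause (ii)
        return all(lpo_gt(s, tj, prec) for tj in ts)
    if f == g and len(ss) == len(ts):                           # clause (i)
        for i, (si, ti) in enumerate(zip(ss, ts)):
            if si == ti:
                continue
            # First differing argument decides; s must also dominate
            # the remaining arguments of t.
            return lpo_gt(si, ti, prec) and all(
                lpo_gt(s, tj, prec) for tj in ts[i + 1:])
    return False


ONE = ("1", ())
I = lambda a: ("I", (a,))
mul = lambda a, b: ("*", (a, b))
PREC = {"I": 3, "*": 2, "1": 1}          # assumed precedence: I > * > 1
RULES = [
    (mul(mul("x", "y"), "z"), mul("x", mul("y", "z"))),
    (mul(ONE, "x"), "x"),
    (mul(I("x"), "x"), ONE),
    (mul(I("x"), mul("x", "y")), "y"),
    (mul("x", ONE), "x"),
    (I(I("x")), "x"),
    (mul("x", I("x")), ONE),
    (mul("x", mul(I("x"), "y")), "y"),
    (I(mul("x", "y")), mul(I("y"), I("x"))),
]
ALL_ORIENTED = all(lpo_gt(lhs, rhs, PREC) for lhs, rhs in RULES)
```

The associativity rule is the one that needs the lexicographic clause: its two sides have the same multiset of variables, and the left-to-right comparison of arguments is what orients it.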

It is possible to combine lpo and rpo (Lescanne [32]). It is also possible to define
semantic path orderings (Kamin, Levy), as opposed to the above precedence orderings.
Semantic path orderings use orderings on $T_\Sigma$ rather than orderings on $\Sigma$ (see Dershowitz [7]).

The relative strength and the ordinals associated with these orderings have been studied
by Okada and Dershowitz [37, 9]. For instance, given a strict ordering $\prec$ on a finite
set $\Sigma$ of $n$ elements, then $T_\Sigma$ under $\prec_{rpo}$ is order-isomorphic to $\varphi_n(0)$, the first $n$-critical
ordinal.^7 In particular, there is a very natural representation of the ordinals less than $\varepsilon_0$
in terms of nested multisets of natural numbers. It is even possible to define an rpo whose
order-type is $\Gamma_0$ (see Dershowitz [7]), if we allow terms to serve as labels.^8

Okada has shown that it is possible to combine the multiset and lexicographic ordering
to obtain term orderings subsuming both the rpo and lpo ordering, and also obtain a system
of notations for the ordinals less than $\Gamma_0$ (see Okada [37], and Dershowitz and Okada [9]).
Such systems are inspired by some earlier work of Ackermann [1], and we briefly describe
one of them.

Let  C  be  a  set  of  constants,  and  F  a  set  of  function  symbols  (we  are  not  assuming 
that  symbols  in  F  have  a  fixed  arity). 

Definition 11.29 For any $n > 0$, the set $A_n(F, C)$ of generalized Ackermann terms is
defined inductively as follows:

(1) $c \in A_n(F, C)$ whenever $c \in C$.

(2) $f(t_1, \ldots, t_n) \in A_n(F, C)$ whenever $f \in F$ and $t_1, \ldots, t_n \in A_n(F, C)$.

The terms defined by (1) and (2) are called connected terms.

(3) $t_1 \# \cdots \# t_m \in A_n(F, C)$, whenever $t_1, \ldots, t_m$ are connected terms in $A_n(F, C)$ ($m \ge 2$).^9

Given a set $\Sigma = C \cup F$ of labels, note that the set $T_\Sigma$ of trees can be viewed as a
subset of $A_1(F, C)$, using the following representation function:

$$rep(c) = c, \text{ when } c \in C, \text{ and}$$

$$rep(f(t_1, \ldots, t_m)) = f(rep(t_1) \# \cdots \# rep(t_m)).$$

Given a preorder $\preceq$ on $C \cup F$, we define a preorder $\preceq_{ack}$ on $A_n(F, C)$ as follows.

Definition 11.30 The Ackermann ordering $\succeq_{ack}$ on $A_n(F, C)$ is defined below. Actually,
we give a simultaneous recursive definition of $\succeq_{ack}$, $\succ_{ack}$, and $\approx_{ack}$, where $s \succ_{ack} t$ iff
$s \succeq_{ack} t$ and $s \not\preceq_{ack} t$, and $s \approx_{ack} t$ iff $s \succeq_{ack} t$ and $s \preceq_{ack} t$.

(1) If $s, t \in C$, then $s \succeq_{ack} t$ iff $s \succeq t$. If $s \in C$ and $t \notin C$, then $t \succ_{ack} s$ (and $t \not\preceq_{ack} s$).

(2) Let $s = f(s_1, \ldots, s_n)$ and $t = g(t_1, \ldots, t_n)$. Then, $s \succeq_{ack} t$ iff one of the conditions
below holds:

(i) $f \approx g$, $s_1 \approx_{ack} t_1$, $\ldots$, $s_{i-1} \approx_{ack} t_{i-1}$, $s_i \succ_{ack} t_i$, and $s \succ_{ack} t_{i+1}, \ldots, s \succ_{ack} t_n$,
for some $i$, $1 \le i \le n$; or

(ii) $f \succ g$ and $f(s_1, \ldots, s_n) \succ_{ack} t_i$ for all $i$, $1 \le i \le n$; or

(iii) $s_i \succeq_{ack} g(t_1, \ldots, t_n)$ for some $i$, $1 \le i \le n$.

(3) Let $s = s_1 \# \cdots \# s_m$ (or $s = s_1$) and $t = t_1 \# \cdots \# t_p$ (or $t = t_1$). Then, $s \succeq_{ack} t$ iff

$$\{s_1, \ldots, s_m\} \succeq^{mult}_{ack} \{t_1, \ldots, t_p\},$$

where $\succeq^{mult}_{ack}$ is the multiset extension of $\succeq_{ack}$.

^7 In this case, $\Sigma$ is not a ranked alphabet. We allow the symbols in $\Sigma$ to have varying (finite) ranks.

^8 These terms are formed using a single symbol ★ that can assume any finite rank.

^9 Compared to the definition in Dershowitz and Okada [9], we require that $t_1, \ldots, t_m$ are connected
terms. This seems cleaner and does not seem to cause any loss of generality.

The following results are stated in Okada [37], and Dershowitz and Okada [9].

Theorem 11.31 (1) If the strict order $\succ$ is well-founded on $C \cup F$, then $\succ_{ack}$ is
well-founded on $A_n(F, C)$.

(2) The multiset extension of rpo is identical to $\succ_{ack}$ on $A_1(F, C)$.

Proof. The proof of (1) uses the techniques used in theorem 11.22 (Kruskal’s theorem).
The proof of (2) is straightforward. □

Equivalently, part (2) of theorem 11.31 says that the restriction of $\succeq_{ack}$ to connected
terms in $A_1(F, C)$ is identical to rpo (we use the representation of terms given by the
function $rep$ described earlier).

Finally, as noted by Okada, $(A_2(\{\psi\}, \{0\}), \preceq_{ack})$ provides a system of notations for the
ordinals less than $\Gamma_0$. This is easily seen using theorem 8.12. To show that $\succeq_{ack}$ corresponds
to the ordering on the ordinals less than $\Gamma_0$, we use lemma 8.11 and lemma 8.10. We can
even define a bijection $ord$ between the equivalence classes of $A_2(\{\psi\}, \{0\})$ modulo $\approx_{ack}$
and the set of ordinals less than $\Gamma_0$ as follows:

$$ord(\psi(s, t)) = \varphi(ord(s), ord(t)),$$

$$ord(s_1 \# \cdots \# s_m) = \alpha_1 + \cdots + \alpha_m,$$

where $\alpha_1 \ge \ldots \ge \alpha_m$ is the sequence obtained by ordering $\{ord(s_1), \ldots, ord(s_m)\}$ in
nonincreasing order.


12  A  Glimpse  at  Hierarchies  of  Fast  and  Slow  Growing  Functions 

In this section, we discuss briefly some hierarchies of functions that play an important
role in logic because they provide natural classifications of recursive functions according to
their computational complexity. It is appropriate to discuss these classes of functions now,
because we have sufficient background about constructive ordinal notations at our disposal.
When restricted to the ordinals less than $\varepsilon_0$, these hierarchies provide natural rate-of-growth
and complexity classifications of the recursive functions which are provably total in Peano’s
arithmetic. In particular, for two of these hierarchies, $F_{\varepsilon_0}$ and $H_{\varepsilon_0}$ dominate every such
function (for all but finitely many arguments). Thus, the statement “$F_{\varepsilon_0}$ is total recursive”
is true, but not provable in Peano’s arithmetic. The relationship with Kruskal’s theorem is
that the function $Fr$ mentioned in the discussion following theorem 5.2 dominates $F_{\Gamma_0}$ (for
all but finitely many arguments). In fact, $Fr$ has the rate of growth of a function $F_\alpha$ where
$\alpha$ is considerably larger than $\Gamma_0$! The results of this section are presented in Cichon and
Wainer [4], and Wainer [54], and the reader is referred to these papers for further details.

For  ease  of  understanding,  we  begin  by  defining  hierarchies  indexed  by  the  natural 
numbers.  There  are  three  classes  of  hierarchies. 

1.  Outer  iteration  hierarchies. 

Let $g : \mathbb{N} \to \mathbb{N}$ be a given function. The hierarchy $(g_m)_{m \in \mathbb{N}}$ is defined as follows: For
all $n \in \mathbb{N}$,

$$g_0(n) = 0,$$

$$g_{m+1}(n) = g(g_m(n)).$$

The prime example of this kind of hierarchy is the slow-growing hierarchy $(G_m)_{m \in \mathbb{N}}$ based
on the successor function $g(n) = n + 1$. This hierarchy is actually rather dull when the
$G_m$ are indexed by finite ordinals, since $G_m(n) = m$ for all $n \in \mathbb{N}$, but it is much more
interesting when the index is an infinite ordinal.

2.  Inner  iteration  hierarchies . 

Again, let $g : \mathbb{N} \to \mathbb{N}$ be a given function. The hierarchy $(h_m)_{m \in \mathbb{N}}$ is defined as follows:
For all $n \in \mathbb{N}$,

$$h_0(n) = n,$$

$$h_{m+1}(n) = h_m(g(n)).$$

The prime example of this kind of hierarchy is the Hardy hierarchy $(H_m)_{m \in \mathbb{N}}$ based on the
successor function $g(n) = n + 1$. This hierarchy is also rather dull when the $H_m$ are indexed
by finite ordinals, since $H_m(n) = n + m$ for all $n \in \mathbb{N}$, but it is much more interesting when
the index is an infinite ordinal.

3. Fast iteration hierarchies.

Let $g : \mathbb{N} \to \mathbb{N}$ be a given increasing function. The hierarchy $(f_m)_{m \in \mathbb{N}}$ is defined as
follows: For all $n \in \mathbb{N}$,

$$f_0(n) = g(n),$$

$$f_{m+1}(n) = f_m^n(n),$$

where $f_m^n(x) = f_m(f_m(\cdots(f_m(x))\cdots))$, the $n$th iterate of $f_m$ applied to $x$. The prime
example of this kind of hierarchy is the Grzegorczyk hierarchy $(F_m)_{m \in \mathbb{N}}$ based on the
successor function $g(n) = n + 1$. This hierarchy is not dull even when the $F_m$ are indexed
by finite ordinals. Indeed, $F_1(n) = 2n$, $F_2(n) = 2^n n$, and

$$2^{2^{\cdot^{\cdot^{\cdot^{2^n}}}}} \Big\}\, n \le F_3(n).$$
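The three finite hierarchies are easy to compute for small arguments. The following Python sketch transcribes the recurrences for an arbitrary base function `g`; the function names `outer`, `inner`, and `fast` are hypothetical, and the successor function is used as in the slow-growing, Hardy, and Grzegorczyk examples.

```python
def outer(g, m, n):
    """g_m(n): g_0(n) = 0, g_{m+1}(n) = g(g_m(n))."""
    value = 0
    for _ in range(m):
        value = g(value)
    return value


def inner(g, m, n):
    """h_m(n): h_0(n) = n, h_{m+1}(n) = h_m(g(n))."""
    for _ in range(m):
        n = g(n)
    return n


def fast(g, m, n):
    """f_m(n): f_0(n) = g(n), f_{m+1}(n) = the n-th iterate of f_m at n."""
    if m == 0:
        return g(n)
    x = n
    for _ in range(n):
        x = fast(g, m - 1, x)
    return x


succ = lambda n: n + 1
```

For instance, `fast(succ, 2, 3)` iterates $F_1(n) = 2n$ three times starting from 3, giving $2^3 \cdot 3 = 24$. Level 3 already produces numbers too large to compute this way for arguments beyond 2.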

In order to get functions growing even faster than those obtained so far, we extend
these hierarchies to infinite ordinals. The trick is to diagonalize at limit ordinals. However,
this presupposes that for each limit ordinal $\alpha$ under consideration, we already have a
particular predefined increasing sequence $\alpha[0], \alpha[1], \ldots, \alpha[n], \ldots$, such that $\alpha = \bigcup_{n \in \mathbb{N}} \alpha[n]$,
a so-called fundamental sequence. The point of ordinal notations is that they allow the
definition of standard fundamental sequences. This is particularly simple for the ordinals
less than $\varepsilon_0$, where we can use the Cantor normal form.

For every limit ordinal $\delta < \varepsilon_0$, if $\delta = \alpha + \beta$, then $\delta[n] = \alpha + \beta[n]$, if $\delta = \omega^{\alpha+1}$, then
$\delta[n] = \omega^\alpha n$ (i.e. $\omega^\alpha + \cdots + \omega^\alpha$, $n$ times), and when $\delta = \omega^\alpha$ for a limit ordinal $\alpha$, then
$\delta[n] = \omega^{\alpha[n]}$. For $\varepsilon_0$ itself, we choose $\varepsilon_0[0] = 0$, and $\varepsilon_0[n + 1] = \omega^{\varepsilon_0[n]}$.

Fundamental sequences can also be assigned to certain classes of limit ordinals larger
than $\varepsilon_0$, but this becomes much more complicated. In particular, this can be done for limit
ordinals less than $\Gamma_0$, using the normal form representation given in theorem 8.2.

Assuming that fundamental sequences have been defined for all limit ordinals in a
given subclass $I$ of $O$, we extend the definition of the hierarchies as follows.

Definition  12.1  Outer  iteration  hierarchies. 

Let $g : \mathbb{N} \to \mathbb{N}$ be a given function. The hierarchy $(g_\alpha)_{\alpha \in I}$ is defined as follows: For
all $n \in \mathbb{N}$,

$$g_0(n) = 0,$$

$$g_{\alpha+1}(n) = g(g_\alpha(n)),$$

$$g_\alpha(n) = g_{\alpha[n]}(n),$$

where in the last case, $\alpha$ is a limit ordinal. The prime example of this kind of hierarchy
is the slow-growing hierarchy $(G_\alpha)_{\alpha \in I}$ based on the successor function $g(n) = n + 1$. This
time, we can show that for any $n$, $G_\omega(n) = n$, and $G_{\alpha+\beta}(n) = G_\alpha(n) + G_\beta(n)$,
from which it follows that $G_{\omega^\alpha}(n) = n^{G_\alpha(n)}$. This means that if $\alpha$ is represented in Cantor
normal form, then $G_\alpha(n)$ is the result of replacing $\omega$ by $n$ throughout the Cantor normal
form! Thus, we have

$$G_{\omega^{\omega^{\omega}}}(n) = n^{n^{n}}.$$

Definition  12.2  Inner  iteration  hierarchies. 

Again, let $g : \mathbb{N} \to \mathbb{N}$ be a given function. The hierarchy $(h_\alpha)_{\alpha \in I}$ is defined as follows:
For all $n \in \mathbb{N}$,

$$h_0(n) = n,$$

$$h_{\alpha+1}(n) = h_\alpha(g(n)),$$

$$h_\alpha(n) = h_{\alpha[n]}(n),$$

where in the last case, $\alpha$ is a limit ordinal. The prime example of this kind of hierarchy is
the Hardy hierarchy $(H_\alpha)_{\alpha \in I}$ based on the successor function $g(n) = n + 1$ (Hardy [20]). It
is easy to show that $h_{\alpha+\beta}(n) = h_\alpha(h_\beta(n))$, and so $h_{\omega^{\alpha+1}}(n) = h^n_{\omega^\alpha}(n)$.

Definition  12.3  Fast  iteration  hierarchies. 

Let $g : \mathbb{N} \to \mathbb{N}$ be a given increasing function. The hierarchy $(f_\alpha)_{\alpha \in I}$ is defined as
follows: For all $n \in \mathbb{N}$,

$$f_0(n) = g(n),$$

$$f_{\alpha+1}(n) = f_\alpha^n(n),$$

$$f_\alpha(n) = f_{\alpha[n]}(n),$$

where $f_\alpha^n(x) = f_\alpha(f_\alpha(\cdots(f_\alpha(x))\cdots))$, the $n$th iterate of $f_\alpha$ applied to $x$, and in the last
case, $\alpha$ is a limit ordinal.

The prime example of this kind of hierarchy is the extended Grzegorczyk hierarchy
$(F_\alpha)_{\alpha \in I}$ based on the successor function $g(n) = n + 1$. It is interesting to note that
Ackermann’s function has rate of growth roughly equivalent to that of $F_\omega$.
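Definitions 12.1 to 12.3 can be exercised on small inputs below $\varepsilon_0$. The Python sketch below is an illustration under stated assumptions: an ordinal in Cantor normal form is encoded as a tuple of its exponents in nonincreasing order (each exponent again such a tuple), the base function is the successor, and the fundamental sequences are the ones described for limit ordinals below $\varepsilon_0$. All names (`fund`, `G`, `H`, `F`, `nat`, `omega_pow`) are hypothetical.

```python
ZERO = ()                                  # the ordinal 0


def nat(k):
    """The finite ordinal k = omega^0 + ... + omega^0 (k times)."""
    return (ZERO,) * k


def omega_pow(e):
    """The ordinal omega^e."""
    return (e,)


def is_succ(a):
    return a != ZERO and a[-1] == ZERO


def pred(a):
    return a[:-1]


def fund(a, n):
    """a[n] for a limit ordinal a below epsilon_0."""
    head, e = a[:-1], a[-1]
    if is_succ(e):                         # omega^(e'+1)[n] = omega^e' * n
        return head + (pred(e),) * n
    return head + (fund(e, n),)            # omega^e[n] = omega^(e[n]), e limit


def G(a, n):
    """Slow-growing hierarchy: count successor steps down to 0."""
    steps = 0
    while a != ZERO:
        if is_succ(a):
            a, steps = pred(a), steps + 1
        else:
            a = fund(a, n)
    return steps


def H(a, n):
    """Hardy hierarchy: H_{a+1}(n) = H_a(n + 1)."""
    while a != ZERO:
        if is_succ(a):
            a, n = pred(a), n + 1
        else:
            a = fund(a, n)
    return n


def F(a, n):
    """Extended Grzegorczyk hierarchy: F_{a+1}(n) = n-th iterate of F_a."""
    if a == ZERO:
        return n + 1
    if is_succ(a):
        x = n
        for _ in range(n):
            x = F(pred(a), x)
        return x
    return F(fund(a, n), n)


OMEGA = omega_pow(nat(1))
OMEGA_OMEGA = omega_pow(OMEGA)
```

With this encoding, `G(OMEGA_OMEGA, 3)` evaluates to $3^3 = 27$, matching the rule of replacing $\omega$ by $n$ in the Cantor normal form, and `H(omega_pow(nat(2)), 3)` agrees with `F(nat(2), 3)`, a small instance of the relationship between the $h$- and $f$-hierarchies. Only tiny arguments are feasible, since the values grow extremely fast.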

It is not difficult to show that $f_\alpha(n) = h_{\omega^\alpha}(n)$. Thus, even though the fast-growing
hierarchy seems to grow faster than the inner iteration hierarchy, the $h$-hierarchy actually
“catches up” with the $f$-hierarchy at $\varepsilon_0$, in the sense that

$$f_{\varepsilon_0}(n - 1) \le h_{\varepsilon_0}(n) \le f_{\varepsilon_0}(n + 1).$$

Given two functions $f, g : \mathbb{N} \to \mathbb{N}$, we say that $g$ majorizes $f$ (or that $g$ dominates
$f$) iff there is some $k \in \mathbb{N}$ such that $g(n) > f(n)$ for all $n \ge k$. It is shown in Buchholz
and Wainer [3] that $F_\beta$ majorizes $F_\alpha$ and that $H_\beta$ majorizes $H_\alpha$ if $\beta > \alpha$. This property
can also be shown for the slow-growing hierarchy. Buchholz and Wainer [3] also show that
every recursive function provably total in Peano’s arithmetic is majorized by some $F_\alpha$ in
the fast-growing hierarchy up to $\varepsilon_0$, and that every $F_\alpha$ for $\alpha < \varepsilon_0$ is recursive and provably
total in PA. It follows that $F_{\varepsilon_0}$ is recursive, but not provably total in PA. Going back to
the function $Fr$ associated with Friedman’s miniature version of Kruskal’s theorem (theorem
5.2), Friedman has shown that $Fr$ majorizes $F_{\Gamma_0}$, and in fact, $Fr$ has the rate of growth of
a function $F_\alpha$ where $\alpha$ is considerably larger than $\Gamma_0$!

We noted that the $h$-hierarchy catches up with the $f$-hierarchy at $\varepsilon_0$. It is natural
to ask whether the slow-growing hierarchy catches up with the fast-growing hierarchy. At
first glance, one might be skeptical that this could happen. But large ordinals are tricky
objects, and in fact there is an ordinal $\alpha$ such that the slow-growing hierarchy catches up
with the fast-growing hierarchy.

Theorem 12.4 (Girard) There is an ordinal $\alpha$ such that $G_\alpha$ and $F_\alpha$ have the same rate
of growth, in the sense that

$G_\alpha(n) < F_\alpha(n) < G_\alpha(an + b)$,

for some simple linear function $an + b$. □

This remarkable result was first proved by Girard [17]. The ordinal $\alpha$ for which $G_\alpha$
and $F_\alpha$ have the same rate of growth is none other than Howard's ordinal, another important
ordinal occurring in proof theory. Unfortunately, we are not equipped to describe it, even
with the apparatus of the normal functions $\varphi(\alpha, \beta)$. Howard's ordinal is greater than $\Gamma_0$,
and it is denoted by $\varphi_{\epsilon_{\Omega+1}}(0)$, where $\Omega$ is the least uncountable ordinal, and $\epsilon_{\Omega+1}$ is
the least $\epsilon$-number after $\Omega$ (so $\epsilon_{\Omega+1} = \Omega^{\Omega^{\cdot^{\cdot^{\cdot}}}}$). Alternate proofs of this result are given
in Cichon and Wainer [4], and Wainer [54] (among others). A fairly simple description of
Howard's ordinal is given in Pohlers [41].

Before  closing  this  section,  we  cannot  resist  mentioning  Goodstein  sequences  [18], 
another nice illustration of the representation of ordinals less than $\epsilon_0$ in Cantor normal
form. 

Let $n$ be any fixed natural number, and consider any natural number $a$ such that

$a < \left. (n+1)^{(n+1)^{\cdot^{\cdot^{\cdot^{(n+1)}}}}} \right\} (n+1)$.
We express $a$ in complete base $n+1$ by first writing $a = m_0 + m_1 (n+1)^{a_1} + \cdots + m_k (n+1)^{a_k}$,
where $m_i \leq n$, and $a_i < a_{i+1}$, and then recursively writing each $a_i$ in complete base $n+1$,
until all the exponents are $\leq n$. Given $a$, denote by $\mathit{rep}(a, n+1)$ its associated representation
in complete base $n+1$. Given a number $a$ and its representation $\mathit{rep}(a, n+1)$, we denote
by $\mathit{shiftrep}(a, n+1)$ the result of replacing $n+1$ by $n+2$ throughout the representation
$\mathit{rep}(a, n+1)$, and by $|\mathit{shiftrep}(a, n+1)|$ the numerical value of this new term.

Definition 12.5 The Goodstein sequence starting with $a > 0$ is defined as follows. Choose
$n$ as the least number such that

$a < \left. (n+1)^{(n+1)^{\cdot^{\cdot^{\cdot^{(n+1)}}}}} \right\} (n+1)$.

Set $a_0 = a \mathbin{\dot{-}} 1$, and $a_{k+1} = |\mathit{shiftrep}(a_k, n+k+1)| \mathbin{\dot{-}} 1$.

In the above definition, $a \mathbin{\dot{-}} b$ is the usual difference between $a$ and $b$ when $a \geq b$, and
it is equal to 0 otherwise. Thus, we obtain $a_{k+1}$ from $a_k$ by changing $n+k+1$ to $n+k+2$
in the representation $\mathit{rep}(a_k, n+k+1)$ of $a_k$ and subtracting 1 from this new value.
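
The recurrence $a_{k+1} = |\mathit{shiftrep}(a_k, n+k+1)| \mathbin{\dot{-}} 1$ is easy to run on small inputs. The
following Python sketch is an illustration only. It takes the starting base $n+1$ as an explicit
argument instead of computing $n$ from the tower condition (an assumption made to keep the
code short), and the names `shift` and `goodstein_prefix` are hypothetical.

```python
from typing import List


def shift(a: int, b: int) -> int:
    """Numerical value of a's complete base-b representation after b -> b + 1.

    Each exponent is itself rewritten recursively, which is what makes the
    representation "complete". Requires b >= 2.
    """
    result, exponent = 0, 0
    while a > 0:
        a, digit = divmod(a, b)
        if digit:
            result += digit * (b + 1) ** shift(exponent, b)
        exponent += 1
    return result


def goodstein_prefix(a: int, base: int, max_terms: int) -> List[int]:
    """Terms a_0, a_1, ... of the sequence starting with a, in starting base `base`.

    Stops when a term is 0 or when max_terms terms have been produced.
    """
    terms = [max(a - 1, 0)]  # a_0 = a - 1, with truncated subtraction
    while terms[-1] > 0 and len(terms) < max_terms:
        k = len(terms) - 1   # terms[-1] is a_k, written in base (base + k)
        terms.append(max(shift(terms[-1], base + k) - 1, 0))
    return terms


print(goodstein_prefix(3, 2, 100))  # [2, 2, 1, 0]
print(goodstein_prefix(4, 2, 100))  # [3, 3, 3, 2, 1, 0]
print(goodstein_prefix(5, 2, 6))    # [4, 26, 41, 60, 83, 109]
```

The cap `max_terms` is essential in practice: the run started from 5 in base 2 does reach 0,
but only after a number of steps far too large to enumerate.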

Theorem 12.6 (Goodstein, Kirby and Paris) Every Goodstein sequence terminates, that
is, there is some $k$ such that $a_k = 0$. Furthermore, the function $\mathit{Good}$ such that $\mathit{Good}(a)$ =
the least $k$ such that $a_k = 0$ is recursive, but it majorizes the function $H_{\epsilon_0}$ from the Hardy
hierarchy.

Proof. The proof that every Goodstein sequence terminates is not that difficult. The trick
is to associate to each $a_k$ an ordinal $\alpha_k < \epsilon_0$ obtained by replacing $n+k+1$ by $\omega$ throughout
$\mathit{rep}(a_k, n+k+1)$. Then, it is easy to see that $\alpha_{k+1} < \alpha_k$ whenever $a_k > 0$, and since there is
no infinite strictly decreasing sequence of ordinals, the sequence $a_k$ reaches 0 for some $k$.
The second part of the theorem is due to Kirby and Paris [26]. Another
relatively simple proof appears in Buchholz and Wainer [3]. □
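
As a small worked illustration of the ordinal assignment (our example, assuming for simplicity
that the starting base is 2 and that $a_0 = 3$, and setting aside the choice of $n$), the terms
$a_k$ and the associated ordinals $\alpha_k$ are:

```latex
\begin{array}{c|c|l|l}
k & \text{base} & a_k \text{ in complete base} & \alpha_k \\ \hline
0 & 2 & 3 = 2^1 + 1 & \omega + 1 \\
1 & 3 & 3 = 3^1     & \omega     \\
2 & 4 & 3           & 3          \\
3 & 5 & 2           & 2          \\
4 & 6 & 1           & 1          \\
5 & 7 & 0           & 0
\end{array}
```

The numbers $a_k$ stay constant for three steps, yet the ordinals decrease strictly at every
step: $\omega + 1 > \omega > 3 > 2 > 1 > 0$.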

Since $H_{\epsilon_0}$ is not provably recursive in PA, Goodstein's theorem is a statement that is
true but not provable in PA.

Readers  interested  in  combinatorial  independence  results  are  advised  to  consult  the 
beautiful  book  on  Ramsey  theory,  by  Graham,  Rothschild,  and  Spencer  [19]. 


13  Constructive  Proofs  of  Higman’s  Lemma 

If  one  looks  closely  at  the  proof  of  Higman’s  lemma  (lemma  3.2),  one  notices  that  the  proof 
is  not  constructive  for  two  reasons: 

(1)  The  proof  proceeds  by  contradiction,  and  thus  it  is  not  intuitionistic. 
(2) The definition of a minimal bad sequence is heavily impredicative, as it involves
universal quantification over all bad sequences.

Thus,  it  is  natural,  and  as  it  turns  out,  quite  challenging,  to  ask  whether  it  is  possible 
to  give  a  constructive  (and  predicative)  proof  of  Higman’s  lemma. 

In a remarkable (and short) paper, Friedman [15] introduces a new and simple
technique, the A-translation, which enables him to give simple proofs of the fact that first-order
classical Peano arithmetic and classical higher-order arithmetic are conservative over their
respective intuitionistic versions with respect to $\Pi^0_2$-sentences. His technique also yields closure
under Markov's rule for several intuitionistic versions of arithmetic (if $\neg\neg\exists x\,\varphi$ is provable,
then $\exists x\,\varphi$ is also provable, where $x$ is a numeric variable, and $\varphi$ is a primitive recursive
relation). Using Friedman's A-translation technique, it follows that there is an
intuitionistic impredicative proof of Higman's lemma. However, it would still be interesting to
see whether a constructive (predicative) proof can be extracted directly from the classical
proof, and Gabriel Stolzenberg was among the first researchers to propose this challenge,
and eventually solve it. It turns out that (at least) two constructive (predicative) proofs of
a constructive version of Higman's lemma have been given independently by Richman and
Stolzenberg [45], and Murthy and Russell [35]. Steve Simpson has proven a related result
for Hilbert's basis theorem [49], and his proof technique seems related to some of the
techniques of Richman and Stolzenberg. The significance of having a constructive proof is
that one gets an algorithm which, given a constructively (and finitely) presented infinite
sequence, yields the leftmost pair of embedded strings. Murthy and Russell [35] do extract
such an algorithm using the NuPRL proof development system. The next challenge is to
find a constructive proof of Kruskal's theorem.
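
To make the computational content concrete, here is a naive Python search for an embedded
pair. It is emphatically not the algorithm extracted by Murthy and Russell, only a sketch under
simplifying assumptions of ours: the strings are over a finite alphabet, embedding means
"scattered subsequence" with letters compared by equality, and "leftmost pair" is read as the
pair $(i, j)$ with $i < j$ having the least $j$, and then the least $i$. The names `embeds` and
`first_embedded_pair` are hypothetical. Classically, Higman's lemma is what guarantees that
the search halts on every infinite sequence of such strings.

```python
from itertools import count
from typing import Iterable, Optional, Tuple


def embeds(u: str, v: str) -> bool:
    """True iff u is a scattered subsequence of v."""
    remaining = iter(v)
    # Each membership test consumes `remaining` up to and including the match,
    # so the letters of u must be found in v from left to right.
    return all(letter in remaining for letter in u)


def first_embedded_pair(words: Iterable[str]) -> Optional[Tuple[int, int]]:
    """Return (i, j), i < j, with words[i] embedded in words[j]; least j, then least i.

    Returns None only if the (finite) input is exhausted without finding a pair.
    """
    seen = []
    for j, v in enumerate(words):
        for i, u in enumerate(seen):
            if embeds(u, v):
                return i, j
        seen.append(v)
    return None


# A finite prefix: "ba" embeds in "bcab".
print(first_embedded_pair(["abc", "ba", "cb", "aa", "bcab"]))  # (1, 4)

# An infinite sequence given by a generator: aaaaa, aaaa, ..., a, bbbbb, bbbbbb, ...
infinite = ("a" * (5 - k) if k < 5 else "b" * k for k in count())
print(first_embedded_pair(infinite))  # (5, 6)
```

The search itself is trivial; what a constructive proof adds is a justification, acceptable
intuitionistically, that it terminates.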